SEQ ID NO:1
Size: 410 
DNA--BAP-1 

1 gcccgttgtc tgtgtgtggg actgaggggc cccgggggcg gtgggggctc ccggtggggg 
61 cagcggtggg gagggagggc ctggacatgg cgctgagggg ccgccccgcg ggaagatgaa 
121 taagggctgg ctggagctgg agagcgaccc aggcctcttc accctgctcg tggaagattt
181 cggtgtcaag ggggtgcaag tggaggagat ctacgacctt cagagcaaat gtcagggccc 
241 tgtatatgga tttatcttcc tgttcaaatg gatcgaagag cgccggtccc ggcgaaaggt 
301 ctctaccttg gtggatgata cgtccgtgat tgatgatgat attgtgaata acatgttctt 
361 tgcccaccag ctgataccca actcttgtgc aactcatgcc ttgctgagcg tgctcctgaa 
421 ctgcagcagc gtggacctgg gacccaccct gagtcgcatg aaggacttca ccaagggttt 
481 cagccctgag agcaaaggat atgcgattgg caatgccccg gagttggcca aggcccataa 
541 tagccatgcc aggcccgagc cacgccacct ccctgagaag cagaatggcc ttagtgcagt 
601 gcggaccatg gaggcgttcc actttgtcag ctatgtgcct atcacaggcc ggctctttga 
661 gctggatggg ctgaaggtbt accccattga ccatgggccc tggggggagg acgaggagtg 
721 gacagacaag gcccggcggg tcatcatgga gcgtatcggc ctcgccactg caggggagcc 
781 ctaccacgac atccgcttca acctgatggc agtggtgccc gaccgcagga tcaagtatga 
841 ggccaggctg catgtgctga aggtgaaccg tcagacagta ctagaggctc tgcagcagct 
901 gataagagta acacagccag agctgattca gacccacaag tctcaagagt cacagctgcc 
961 tgaggagtcc aagtcagcca gcaacaagtc cccgctggtg ctggaagcaa acagggcccc 
1021 tgcagcctct gagggcaacc acacagatgg tgcagaggag gcggctggtt catgcgcaca 
1081 agccccatcc cacagccctc ccaacaaacc caagctagtg gtgaagcctc caggcagcag 
1141 cctcaatggg gttcacccca accccactcc cattgtccag cggctgccgg cctttctaga 
1201 caatcacaat tatgccaagt cccccatgca ggaggaagaa gacctggcgg caggtgtggg 
1261 ccgcagccga gttccagtcc gcccacccca gcagtactca gatgatgagg atgactatga 
1321 ggatgacgag gaggatgacg tgcagaacac caactctgcc cttaggtata aggggaaggg 
1381 aacagggaag ccaggggcat tgagcggttc tgctgatggg caactgtcag tgctgcagcc
1441 caacaccatc aacgtcttgg ctgagaagct caaagagtcc cagaaggacc tctcaattcc 
1501 tctgtccatc aagactagca gcggggctgg gagtccggct gtggcagtgc ccacacactc 
1561 gcagccctca cccaccccca gcaatgagag tacagacacg gcctctgaga tcggcagtgc 
1621 tttcaactcg ccactgcgct cgcctatccg ctcagccaac ccgacgcggc cctccagccc 
1681 tgtcacctcc cacatctcca aggtgctttt tggagaggat gacagcctgc tgcgtgttga 
1741 ctgcatacgc tacaaccgtg ctgtccgtga tctgggtcct gtcatcagca caggcctgct 
1801 gcacctggct gaggatgggg tgctgagtcc cctggcgctg acagagggtg ggaagggttc 
1861 ctcgccctcc atcagaccaa tccaaggcag ccaggggtcc agcagcccag tggagaagga 
1921 ggtcgtggaa gccacggaca gcagagagaa gacggggatg gtgaggcctg gcgagccctt 
1981 gagtggggag aaatactcac ccaaggagct gctggcactg ctgaagtgtg tggaggctga 
2041 gattgcaaac tatgaggcgt gcctcaagga ggaggtagag aagaggaaga agttcaagat 
2101 tgatgaccag agaaggaccc acaactacga tgagttcatc tgcaccttta tctccatgct 
2161 ggctcaggaa ggcatgctgg ccaacctagt ggagcagaac atctccgtgc ggcggcgcca 
2221 aggggtcagc atcggccggc tccacaagca gcggaagcct gaccggcgga aacgctctcg 
2281 cccctacaag gccaagcgcc agtgaggact gctggccctg actctgcagc ccactcttgc 
2341 cgtgtggccc tcaccagggt ccttccctgc cccacttccc cttttcccag tattactgaa 
2401 tagtcccagc tggagagtcc aggccctggg aatgggagga accaggccac attccttcca 
2461 tcgtgccctg aggcctgaca cggcagatca gccccatagt gctcaggagg cagcatctgg 
2521 agttggggca cagcgaggta ctgcagcttc ctccacagcc ggctgtggag cagcaggacc 
2581 tggcccttct gcctgggcag cagaatatat attttaccta tcagagacat ctatttttct 
2641 gggctccaac ccaacatgcc accatgttga cataagttcc tacctgacta tgctttctct 
2701 cctaggagct gtcctggtgg gcccaggtcc ttgtatcatg ccacggtccc aactacaggg 
2761 tcctagctgg gggcctgggt gggccctggg ctctgggccc tgctgctcta gccccagcca 
2821 ccagcctgtc cctgttgtaa ggaagccagg tcttctctct tcattcctct taggagagtg 
2881 ccaaactcag ggacccagca ctgggctggg ttgggagtag ggtgtcccag tggggttggg 
2941 gtgagcaggc tgctgggatc ccatggcctg agcagagcat gtgggaactg ttcagtggcc 
3001 tgtgaactgt cttccttgtt ctagccaggc tgttcaagac tgctctccat agcaaggttc 
3061 tagggctctt cgccttcagt gttgtggccc tagctatggg cctaaattgg gctctaggtc 
3121 tctgtccctg gcgcttgagg ctcagaagag cctctgtcca gcccctcagt attaccatgt 
3181 ctccctctca ggggtagcag agacagggtt gcttatagga agctggcacc actcagctct
3241 tcctgctact ccagtttcct cagcctctgc aaggcactca gggtggggga cagcaggatc
3301 aagacaaccc gttggagccc ctgtgttcca gaggacctga tgccaagggg taatgggccc
3361 agcagtgcct ctggagccca ggccccaaca cagccccatg gcctctgcca gatggctttg
3421 aaaaaggtga tccaagcagg cccctttatc tgtacatagt gactgagtgg ggggtgctgg
3481 caagtgtggc agctgcctct gggctgagca cagcttgacc cctctagccc ctgtaaatac
3541 tggatcaatg aatgaataaa actctcctaa gaatctcctg agaaaaaaaa aaaaaaaaa

SEQ ID NO:2 
Size: 729 
PRT--BAP-1 

MNKGWLELESDPGLFTLLVEDFGVKGVQVEEIYDLQSKCQGPVYGFIFLFKWIEERRSRRKVSTLVDDTSVIDDD
IVNNMFFAHQLIPNSCATHALLSVLLNCSSVD^ 

HLPEKQNGLSAVRTMEAFHFVSYVPITGRLFELDGLKVYPIDHGPWGEDE

IRFNLMAVVPDRRIKYEARLHVLKVNRQTVLEALQQLIRVTQPELIQTHKSQESQLPEESKSASNKSPLVLEANR
APAASEGNHTDGAEEAAGSCAQAPSHSPPNKPKLVVKPPGSSLNGVHPNPTPIVQRLPAFLDNHNYAKSPMQEEE
DLAAGVGRSRVPVRPPQQYSDDEDDYEDDEEDDVQNTNSALRYKGKGTGKPGALSGSADG

KLKESQKDLSIPLSIKTSSGAGSPAVAVPTHSQPSPTPSNESTDTASEIGSAFNSPLRSPIRSANPTRPSSPVTS

HISKVLFGEDDSLLRVDCIRYNRAVRDLGPVISTGLM^

KEVVEATDSREKTGMVRPGEPLSGEKYSPKELLALLKCV^ 

CTFISMLAQEGMLANLVEQNISVRRRQGVSIGRLHKQRKPDRRKRSRPYKAKRQ 
SEQ ID NO:3

Size: 437 
DNA--NP95 

1 CGACTCCTTA GAGCATGGCA TGGCTCAGAG GTGCTGGTAA AACTGATGGG GGTTTTTGCT
61 GTCCCTCCCC TCAGCGCCGA CACCATGTGG ATCCAGGTTC GGACCATGGA CGGGAGGCAG 
121 ACCCACACGG TGGACTCGCT GTCCAGGCTG ACCAAGGTGG AGGAGCTGAG GCGGAAGATC 
181 CAGGAGCTGT TCCACGTGGA GCCAGGCCTG CAGAGGCTGT TCTACAGGGG CAAACAGATG 
241 GAGGACGGCC ATACCCTCTT CGACTACGAG GTCCGCCTGA ATGACACCAT CCAGCTCCTG
301 GTCCGCCAGA GCCTCGTGCT CCCCCACAGC ACCAAGGAGC GGGACTCCGA GCTCTCCGAC
361 ACCGACTCCG GCTGCTGCCT GGGCCAGAGT GAGTCAGACA AGTCCTCCAC CCACGGCGAG 
421 GCGGCCGCCG AGACTGACAG CAGGCCAGCC GATGAGGACA TGTGGGATGA GACGGAATTG 
481 GGGCTGTACA AGGTCAATGA GTACGTCGAT GCTCGGGACA CGAACATGGG GGCGTGGTTT 
541 GAGGCGCAGG TGGTCAGGGT GACGCGGAAG GCCCCCTCCC GGGACGAGCC CTGCAGCTCC 
601 ACGTCCAGGC CGGCGCTGGA GGAGGACGTC ATTTACCACG TGAAATACGA CGACTACCCG 
661 GAGAACGGCG TGGTCCAGAT GAACTCCAGG GACGTCCGAG CGCGCGCCCG CACCATCATC 
721 AAGTGGCAGG ACCTGGAGGT GGGCCAGGTG GTCATGCTCA ACTACAACCC CGACAACCCC 
781 AAGGAGCGGG GCTTCTGGTA CGACGCGGAG ATCTCCAGGA AGCGCGAGAC CAGGACGGCG 
841 CGGGAACTCT ACGCCAACGT GGTGCTGGGG GATGATTCTC TGAACGACTG TCGGATCATC 
901 TTCGTGGACG AAGTCTTCAA GATTGAGCGG CCGGGTGAAG GGAGCCCCAT GGTTGACAAC 
961 CCCATGAGAC GGAAGAGCGG GCCGTCCTGC AAGCACTGCA AGGACGACGT GAACAGACTC 
1021 TGCCGGGTCT GCGCCTGCCA CCTGTGCGGG GGCCGGCAGG ACCCCGACAA GCAGCTCATG
1081 TGCGATGAGT GCGACATGGC CTTCCACATC TACTGCCTGG ACCCGCCCCT CAGCAGTGTT
1141 CCCAGCGAGG ACGAGTGGTA CTGCCCTGAG TGCCGGAATG ATGCCAGCGA GGTGGTACTG
1201 GCGGGAGAGC GGCTGAGAGA GAGCAAGAAG AAGGCGAAGA TGGCCTCGGC CACATCGTCC
1261 TCACAGCGGG ACTGGGGCAA GGGCATGGCC TGTGTGGGCC GCACCAAGGA ATGTACCATC 
1321 GTCCCGTCCA ACCACTACGG ACCCATCCCG GGGATCCCCG TGGGCACCAT GTGGCGGTTC 

1381 CGAGTCCAGG TCAGCGAGTC GGGTGTCCAT CGGCCCCACG TGGCTGGCAT ACACGGCCGG
1441 AGCAACGACG GAGCGTACTC CCTAGTCCTG GCGGGGGGCT ATGAGGATGA CGTGGACCAT
1501 GGGAATTTTT TCACATACAC GGGTAGTGGT GGTCGAGATC TTTCCGGCAA CAAGAGGACC 
1561 GCGGAACAGT CTTGTGATCA GAAACTCACC AACACCAACA GGGCGCTGGC TCTCAACTGC 
1621 TTTGCTCCCA TCAATGACCA AGAAGGGGCC GAGGCCAAGG ACTGGCGGTC GGGGAAGCCG 
1681 GTCAGGGTGG TGCGCAATGT CAAGGGTGGC AAGAATAGCA AGTACGCCCC CGCTGAGGGC 
1741 AACCGCTACG ATGGCATCTA CAAGGTTGTG AAATACTGGC CCGAGAAGGG GAAGTCCGGG 
1801 TTTCTCGTGT GGCGCTACCT TCTGCGGAGG GACGATGATG AGCCTGGCCC TTGGACGAAG 
1861 GAGGGGAAGG ACCGGATCAA GAAGCTGGGG CTGACCATGC AGTATCCAGA AGGCTACCTG 
1921 GAAGCCCTGG CCAACCGAGA GCGAGAGAAG GAGAACAGCA AGAGGGAGGA GGAGGAGCAG 
1981 CAGGAGGGGG GCTTCGCGTC CCCCAGGACG GGCAAGGGCA AGTGGAAGCG GAAGTCGGCA 
2041 GGAGGTGGCC CGAGCAGGGC CGGGTCCCCG CGCCGGACAT CCAAGAAAAC CAAGGTGGAG
2101 CCCTACAGTC TCACGGCCCA GCAGAGCAGC CTCATCAGAG AGGACAAGAG CAACGCCAAG
2161 CTGTGGAATG AGGTCCTGGC GTCACTCAAG GACCGGCCGG CGAGCGGCAG CCCGTTCCAG
2221 TTGTTCCTGA GTAAAGTGGA GGAGACGTTC CAGTGTATCT GCTGTCAGGA GCTGGTGTTC
2281 CGGCCCATCA CGACCGTGTG CCAGCACAAC GTGTGCAAGG ACTGCCTGGA CAGATCCTTT
2341 CGGGCACAGG TGTTCAGCTG CCCTGCCTGC CGCTACGACC TGGGCCGCAG CTATGCCATG
2401 CAGGTGAACC AGCCTCTGCA GACCGTCCTC AACCAGCTCT TCCCCGGCTA CGGCAATGGC 
2461 CGGTGATCTC CAAGCACTTC TCGACAGGCG TTTTGCTGAA AACGTGTCGG AGGGCTCGTT
2521 CATCGGCACT GATTTTGTTC TTAGTGGGCT TAACTTAAAC AGGTAGTGTT TCCTCCGTTC
2581 CCTAAAAAGG TTTGTCTTCC TTTTTTTTTA TTTTTATTTT TCAAATCTAT ACATTTTCAG
2641 GAATTTATGT ATTCTGGCTA AAAGTTGGAC TTCTCAGTAT TGTGTTTAGT TCTTTGAAAA
2701 CATAAAAGCC TGCAATTTCT CGACAAAACA ACACAAGATT TTTTAAAGAT GGAATCAGAA
2761 ACTACGTGGT GTGGAGGCTG TTGATGTTTC TGGTGTCAAG TTCTCAGAAG TTGCTGCCAC
2821 CAACTCTTTA AGAAGGCGAC AGGATCAGTC CTTCTCTAGG GTTCTGGCCC CCAAGGTCAG
2881 AGCAAGCATC TTCCTGACAG CATTTTGTCA TCTAAAGTCC AGTGACATGG TTCCCCGTGG
2941 TGGCCCGTGG CAGCCCGTGG CATGGCGTGG CTCAGCTGTC TGTTGAAGTT GTTGCAAGGA
3001 AAAGAGGAAA CATCTCGGGC CTAGTTCAAA CCTTTGCCTC AAAGCCATCC CCCACCAGAC
3061 TGCTTAGCGT CTGAGATCCG CGTGAAAAGT CCTCTGCCCA CGAGAGCAGG GAGTTGGGGC
3121 CACGCAGAAA TGGCCTCAAG GGGACTCTGC TCCACGTGGG GCCAGGCGTG TGACTGACGC
3181 TGTCCGACGA AGGCGGCCAC GGACGGACGC CAGCACACGA AGTCACGTGC AAGTGCCTTT
3241 GATTCGTTCC TTCTTTCTAA AGACGACAGT CTTTGTTGTT AGCACTGAAT TATTGAAAAT 
3301 GTCAACCAGA TTCTAGAAAC TGCGGTCATC CAGTTCTTCC TGACACCGGA TGGGTGCTTG 
3361 GGAACCGTTT GAGCCTTATA GATCATTTAC ATTCAATTTT TTTAACTCAG CAAGTGAGAA 
3421 CTTACAAGAG GGTTTTTTTT TAATTTTTTT TTCTCTTAAT GAACACATTT TCTAAATGAA 
3481 TTTTTTTTGT AGTTACTGTA TATGTACCAA GAAAGATATA ACGTTAGGGT TTGGTTGTTT
3541 TTGTTTTTGT ATTTTTTTTC TTTTGAAAGG GTTTGTTAAT TTTTCTAATT TTACCAAAGT 
3601 TTGCAGCCTA TACCTCAATA AAACAGGGAT ATTTTAAATC ACATACCTGC AGACAAACTG 
3661 GAGCAATGTT ATTTTTAAAG GGTTTTTTTC ACCTCCTTAT TCTTAGATTA TTAATGTATT 
3721 AGGGAAGAAT GAGACAATTT TGTGTAGGCT TTTTCTAAAG TCCAGTACTT TGTCCAGATT 
3781 TTAGATTCTC AGAATAAATG TTTTTCACAG ATTGAAAAAA AAAAAAAA 

SEQ ID NO:4 
Size: 135 
PRT--NP95

MWIQVRTMDGRQTHTVDSLSRLTKVEELRRKIQELFHVEPGLQRLFYRGKQMEDGHTLFDYEVRLNDTIQLLVRQ 

SLVLPHSTKERDSELSDTDSGCCLGQSESDKSSTHGEAAAETDSRPADEDMWDET^ 

WFEAQVVRVTRKAPSRDEPCSSTSRPALEEDVIYHVKYDDYPENGVVQMNS^

NYNPDNPKERGFWYDAEISRKRETRTARELYANVVLGDDSL^

SCKHCKDDVNRLCRVCACHLCGGRQDPDKQLMCDECDMAFHIYC^ 

RLRESKKKAKMASATSSSQRDWGKGMACVGRTKECTIVPSNHYGPIPGIPVGTMWRFRVQVSESGVHRPHVAGIH 

GRSNDGAYSLVLAGGYEDDVDHGNFFTYTGSGGRDLSGNKRTAEQSCDQKLTNTNRALALNCFAPINDQEGAEAK

DWRSGKPVRVVRNVKGGKNSKYAPAEGNRYDGIYKVVKYWPEK^

LGLTMQYPEGYLEALANREREKENSKREEEEQQEGGFASPRTG

LTAQQSSLIREDKSNAKLWNEVLASLKDRPASGS

SFRAQVFSCPACRYDLGRSYAMQVNQPLQTVLNQLFPGYGNGR
LOCUS       NM_000135               5503 bp    mRNA    linear   PRI 05-JUL-2001
DEFINITION  Homo sapiens Fanconi anemia, complementation group A (FANCA),
            mRNA.
ACCESSION   NM_000135
VERSION     NM_000135.1  GI:4503654
KEYWORDS    FANCA
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 5503)
  AUTHORS   Pronk JC, Gibson RA, Savoia A, Wijker M, Morgan NV, Melchionda S,
            Ford D, Temtamy S, Ortega JJ, Jansen S and et al.
  TITLE     Localisation of the Fanconi anaemia complementation group A gene to
            chromosome 16q24.3
  JOURNAL   Nat. Genet. 11 (3), 338-340 (1995)
  MEDLINE   96042566
   PUBMED   7581462
REFERENCE   2  (bases 1 to 5503)
  AUTHORS   Lo Ten Foe,J.R., Rooimans,M.A., Bosnoyan-Collins,L., Alon,N.,
            Wijker,M., Parker,L., Lightfoot,J., Carreau,M., Callen,D.F.,
            Savoia,A., Cheng,N.C., Van Berkel,C.G.M., Strunk,M.H.P.,
            Gille,J.J.P., Pals,G., Kruyt,F.A.E., Pronk,J.C., Arwert,F.,
            Buchwald,M. and Joenje,H.
  TITLE     Expression cloning of a cDNA for the major Fanconi anaemia gene,
            FAA
  JOURNAL   Nat. Genet. 14 (3), 320-323 (1996)
  MEDLINE   97051928
REFERENCE   3  (bases 1 to 5503)
  AUTHORS   Ianzano L, D'Apolito M, Centra M, Savino M, Levran O, Auerbach AD,
            Cleton-Jansen AM, Doggett NA, Pronk JC, Tipping AJ, Gibson RA,
            Mathew CG, Whitmore SA, Apostolou S, Callen DF, Zelante L and
            Savoia A.
  TITLE     The genomic organization of the Fanconi anemia group A (FAA) gene
  JOURNAL   Genomics 41 (3), 309-314 (1997)
  MEDLINE   97312685
   PUBMED   9169126
REFERENCE   4  (bases 1 to 5503)
  AUTHORS   Joenje H, Oostra AB, Wijker M, di Summa FM, van Berkel CG, Rooimans
            MA, Ebell W, van Weel M, Pronk JC, Buchwald M and Arwert F.
  TITLE     Evidence for at least eight Fanconi anemia genes
  JOURNAL   Am. J. Hum. Genet. 61 (4), 940-944 (1997)
  MEDLINE   98016453
   PUBMED   9382107
REFERENCE   5  (bases 1 to 5503)
  AUTHORS   Kupfer GM, Naf D, Suliman A, Pulsipher M and D'Andrea AD.
  TITLE     The Fanconi anaemia proteins, FAA and FAC, interact to form a
            nuclear complex
  JOURNAL   Nat. Genet. 17 (4), 487-490 (1997)
  MEDLINE   98061104
   PUBMED   9398857
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from X99226.1.
FEATURES             Location/Qualifiers
     source          1..5503
                     /organism="Homo sapiens"
                     /isolate="healthy control"
                     /db_xref="taxon:9606"
                     /chromosome="16"
                     /map="16q24.3"
                     /clone="D"
                     /cell_line="HSC93"
                     /cell_type="lymphoblastoid"
                     /clone_lib="pREP4"
     gene            1..5503
                     /gene="FANCA"
                     /note="FA; FA1; FAA; FACA; FANCH"
                     /db_xref="LocusID:2175"
                     /db_xref="MIM:227650"
                     /db_xref="MIM:603468"
     CDS             32..4399
                     /gene="FANCA"
                     /function="acts with other genes to control FA pathway"
                     /note="Fanconi anemia, complementation group H"
                     /codon_start=1
                     /db_xref="LocusID:2175"
                     /db_xref="MIM:227650"
                     /db_xref="MIM:603468"
                     /product="Fanconi anemia, complementation group A"
                     /protein_id="NP_000126.1"
                     /db_xref="GI:4503655"
                     /translation="MSDSWVPNSASGQDPGGRRRAWAELLAGRVKREKYNPERAQKLK
                     ESAVRLLRSHQDLNALLLEVEGPLCKKLSLSKVIDCDSSEAYANHSSSFIGSALQDQA
                     SRLGVPVGILSAGMVASSVGQICTAPAETSHPVLLTVEQRKKLSSLLEFAQYLLAHSM
                     FSRLSFCQELWKIQSSLLLEAVWHLHVQGIVSLQELLESHPDMHAVGSWLFRNLCCLC
                     EQMEASCQHADVARAMLSDFVQMFVLRGFQKNSDLRRTVEPEKMPQVTVDVLQRMLIF
                     ALDALAAGVQEESSTHKIVRCWFGVFSGHTLGSVISTDPLKRFFSHTLTQILTHSPVL
                     KASDAVQMQREWSFARTHPLLTSLYRRLFVMLSAEELVGHLQEVLETQEVHWQRVLSF
                     VSALVVCFPEAQQLLEDWVARLMAQAFESCQLDSMVTAFLVVRQAALEGPSAFLSYAD
                     WFKASFGSTRGYHGCSKKALVFLFTFLSELVPFESPRYLQVHILHPPLVPSKYRSLLT
                     DYISLAKTRLADLKVSIENMGLYEDLSSAGDITEPHSQALQDVEKAIMVFEHTGNIPV
                     TVMEASIFRRPYYVSHFLPALLTPRVLPKVPDSRVAFIESLKRADKIPPSLYSTYCQA
                     CSAAEEKPEDAALGVRAEPNSAEEPLGQLTAALGELRASMTDPSQRDVISAQVAVISE
                     RLRAVLGHNEDDSSVEISKIQLSINTPRLEPREHIAVDLLLTSFCQNLMAASSVAPPE
                     RQGPWAALFVRTMCGRVLPAVLTRLCQLLRHQGPSLSAPHVLGLAALAVHLGESRSAL
                     PEVDVGPPAPGAGLPVPALFDSLLTCRTRDSLFFCLKFCTAAISYSLCKFSSQSRDTL
                     CSCLSPGLIKKFQFLMFRLFSEARQPLSEEDVASLSWRPLHLPSADWQRAALSLWTHR
                     TFREVLKEEDVHLTYQDWLHLELEIQPEADALSDTERQDFHQWAIHEHFLPESSASGG
                     CDGDLQAACTILVNALMDFHQSSRSYDHSENSDLVFGGRTGNEDIISRLQEMVADLEL
                     QQDLIVPLGHTPSQEHFLFEIFRRRLQALTSGWSVAASLQRQRELLMYKRILLRLPSS
                     VLCGSSFQAEQPITARCEQFFHLVNSEMRNFCSHGGALTQDITAHFFRGLLNACLRSR
                     DPSLMVDFILAKCQTKCPLILTSALVWWPSLEPVLLCRWRRHCQSPLPRELQKLQEGR
                     QFASDFLSPEAASPAPNPDWLSAAALHFAIQQVREENIRKQLKKLDCEREELLVFLFF
                     FSLMGLLSSHLTSNSTTDLPKAFHVCAAILECLEKRKISWLALFQLTESDLRLGRLLL
                     RVAPDQHTRLLPFAFYSLLSYFHEDAAIREEAFLHVAVDMYLKLVQLFVAGDTSTVSP
                     PAGRSLELKGQGNPVELITKARLFLLQLIPRCPKKSFSHVAELLADRGDCDPEVSAAL
                     QSRQQAAPDADLSQEPHLF"
     variation       48
                     /allele="A"
                     /allele="T"
                     /db_xref="dbSNP:1800282"
     variation       1174
                     /allele="G"
                     /allele="T"
                     /db_xref="dbSNP:1800331"
     variation       1321
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1800332"
     variation       complement(1532)
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:2239359"
     variation       3214
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:1800346"
     variation       3685
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1800358"
     variation       4553
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1230"
BASE COUNT 1208 a 1527 c 1492 g 1276 t
ORIGIN
1 agccgccgcc ggggctgtag gcgccaaggc catgtccgac tcgtgggtcc cgaactccgc
61 ctcgggccag gacccagggg gccgccggag ggcctgggcc gagctgctgg cgggaagggt
121 caagagggaa aaatataatc ctgaaagggc acagaaatta aaggaatcag ctgtgcgcct
181 cctgcgaagc catcaggacc tgaatgccct tttgcttgag gtagaaggtc cactgtgtaa
241 aaaattgtct ctcagcaaag tgattgactg tgacagttct gaggcctatg ctaatcattc
301 tagttcattt ataggctctg ctttgcagga tcaagcctca aggctggggg ttcccgtggg
361 tattctctca gccgggatgg ttgcctctag cgtgggacag atctgcacgg ctccagcgga
421 gaccagtcac cctgtgctgc tgactgtgga gcagagaaag aagctgtctt ccctgttaga
481 gtttgctcag tatttattgg cacacagtat gttctcccgt ctttccttct gtcaagaatt
541 atggaaaata cagagttctt tgttgcttga agcggtgtgg catcttcacg tacaaggcat
601 tgtgagcctg caagagctgc tggaaagcca tcccgacatg catgctgtgg gatcgtggct
661 cttcaggaat ctgtgctgcc tttgtgaaca gatggaagca tcctgccagc atgctgacgt
721 cgccagggcc atgctttctg attttgttca aatgtttgtt ttgaggggat ttcagaaaaa
781 ctcagatctg agaagaactg tggagcctga aaaaatgccg caggtcacgg ttgatgtact
841 gcagagaatg ctgatttttg cacttgacgc tttggctgct ggagtacagg aggagtcctc
901 cactcacaag atcgtgaggt gctggttcgg agtgttcagt ggacacacgc ttggcagtgt
961 aatttccaca gatcctctga agaggttctt cagtcatacc ctgactcaga tactcactca
1021 cagccctgtg ctgaaagcat ctgatgctgt tcagatgcag agagagtgga gctttgcgcg
1081 gacacaccct ctgctcacct cactgtaccg caggctcttt gtgatgctga gtgcagagga
1141 gttggttggc catttgcaag aagttctgga aacgcaggag gttcactggc agagagtgct
1201 ctcctttgtg tctgccctgg ttgtctgctt tccagaagcg cagcagctgc ttgaagactg
1261 ggtggcgcgt ttgatggccc aggcattcga gagctgccag ctggacagca tggtcactgc
1321 gttcctggtt gtgcgccagg cagcactgga gggcccctct gcgttcctgt catatgcaga
1381 ctggttcaag gcctcctttg ggagcacacg aggctaccat ggctgcagca agaaggccct
1441 ggtcttcctg tttacgttct tgtcagaact cgtgcctttt gagtctcccc ggtacctgca
1501 ggtgcacatt ctccacccac ccctggttcc cagcaagtac cgctccctcc tcacagacta
1561 catctcattg gccaagacac ggctggccga cctcaaggtt tctatagaaa acatgggact
1621 ctacgaggat ttgtcatcag ctggggacat tactgagccc cacagccaag ctcttcagga
1681 tgttgaaaag gccatcatgg tgtttgagca tacggggaac atcccagtca ccgtcatgga
1741 ggccagcata ttcaggaggc cttactacgt gtcccacttc ctccccgccc tgctcacacc
1801 tcgagtgctc cccaaagtcc ctgactcccg tgtggcgttt atagagtctc tgaagagagc
1861 agataaaatc cccccatctc tgtactccac ctactgccag gcctgctctg ctgctgaaga
1921 gaagccagaa gatgcagccc tgggagtgag ggcagaaccc aactctgctg aggagcccct
1981 gggacagctc acagctgcac tgggagagct gagagcctcc atgacagacc ccagccagcg
2041 tgatgttata tcggcacagg tggcagtgat ttctgaaaga ctgagggctg tcctgggcca
2101 caatgaggat gacagcagcg ttgagatatc aaagattcag ctcagcatca acacgccgag
2161 actggagcca cgggaacaca ttgctgtgga cctcctgctg acgtctttct gtcagaacct
2221 gatggctgcc tccagtgtcg ctcccccgga gaggcagggt ccctgggctg ccctcttcgt
2281 gaggaccatg tgtggacgtg tgctccctgc agtgctcacc cggctctgcc agctgctccg
2341 tcaccagggc ccgagcctga gtgccccaca tgtgctgggg ttggctgccc tggccgtgca
2401 cctgggtgag tccaggtctg cgctcccaga ggtggatgtg ggtcctcctg cacctggtgc
2461 tggccttcct gtccctgcgc tctttgacag cctcctgacc tgtaggacga gggattcctt
2521 gttcttctgc ctgaaatttt gtacagcagc aatttcttac tctctctgca agttttcttc
2581 ccagtcacga gatactttgt gcagctgctt atctccaggc cttattaaaa agtttcagtt
2641 cctcatgttc agattgttct cagaggcccg acagcctctt tctgaggagg acgtagccag
2701 cctttcctgg agacccttgc accttccttc tgcagactgg cagagagctg ccctctctct
2761 ctggacacac agaaccttcc gagaggtgtt gaaagaggaa gatgttcact taacttacca
2821 agactggtta cacctggagc tggaaattca acctgaagct gatgctcttt cagatactga
2881 acggcaggac ttccaccagt gggcgatcca tgagcacttt ctccctgagt cctcggcttc
2941 agggggctgt gacggagacc tgcaggctgc gtgtaccatt cttgtcaacg cactgatgga
3001 tttccaccaa agctcaagga gttatgacca ctcagaaaat tctgatttgg tctttggtgg
3061 ccgcacagga aatgaggata ttatttccag attgcaggag atggtagctg acctggagct
3121 gcagcaagac ctcatagtgc ctctcggcca caccccttcc caggagcact tcctctttga
3181 gattttccgc agacggctcc aggctctgac aagcgggtgg agcgtggctg ccagccttca
3241 gagacagagg gagctgctaa tgtacaaacg gatcctcctc cgcctgcctt cgtctgtcct
3301 ctgcggcagc agcttccagg cagaacagcc catcactgcc agatgcgagc agttcttcca
3361 cttggtcaac tctgagatga gaaacttctg ctcccacgga ggtgccctga cacaggacat
3421 cactgcccac ttcttcaggg gcctcctgaa cgcctgtctg cggagcagag acccctccct
3481 gatggtcgac ttcatactgg ccaagtgcca gacgaaatgc cccttaattt tgacctctgc
3541 tctggtgtgg tggccgagcc tggagcctgt gctgctctgc cggtggagga gacactgcca
3601 gagcccgctg ccccgggaac tgcagaagct acaagaaggc cggcagtttg ccagcgattt
3661 cctctcccct gaggctgcct ccccagcacc caacccggac tggctctcag ctgctgcact
3721 gcactttgcg attcaacaag tcagggaaga aaacatcagg aagcagctaa agaagctgga
3781 ctgcgagaga gaggagctat tggttttcct tttcttcttc tccttgatgg gcctgctgtc
3841 gtcacatctg acctcaaata gcaccacaga cctgccaaag gctttccacg tttgtgcagc
3901 aatcctcgag tgtttagaga agaggaagat atcctggctg gcactctttc agttgacaga
3961 gagtgacctc aggctggggc ggctcctcct ccgtgtggcc ccggatcagc acaccaggct
4021 gctgcctttc gctttttaca gtcttctctc ctacttccat gaagacgcgg ccatcaggga
4081 agaggccttc ctgcatgttg ctgtggacat gtacttgaag ctggtccagc tcttcgtggc
4141 tggggataca agcacagttt cacctccagc tggcaggagc ctggagctca agggtcaggg
4201 caaccccgtg gaactgataa caaaagctcg tctttttctg ctgcagttaa tacctcggtg
4261 cccgaaaaag agcttctcac acgtggcaga gctgctggct gatcgtgggg actgcgaccc
4321 agaggtgagc gccgccctcc agagcagaca gcaggctgcc cctgacgctg acctgtccca
4381 ggagcctcat ctcttctgac gggacctgcc actgcacacc agcccagctc ccgtgtaaat
4441 aatttattac aagcataaca tggagctctt gttgcactaa aaagtggatt acaaatctcc
4501 tcgactgctt tagtggggaa aggaatcaat tatttatgaa ctgtccggcc ccgagtcact
4561 cagcgtttgc gggaaaataa accactggtc ccagagcaga ggaaggctac ttgagccgga
4621 caccaagccc gcctccagca ccaagggcgg gcagcaccct ccgaccctcc catgcgggtg
4681 cacacgaagg gtgaggctga cacagccact gcggagtcca ggctgctaga ggtgctcatc
4741 ctcactgccg tcctcaggtg ggttcgggct tcaccgcctg gccctctgtg gtcacagagg
4801 ggctcggtgg cccaggtggt ggttccgcct ccaggggcag ggccttgtcc tgggtctgtg
4861 tcagcgggtg caccatggac atgtgtacat tgaggttgtg ggccttctca aaccgccggc
4921 cacactggtc acaggcaaag tccagctcag tctcagcctt gtgtttggtc atgtggtact
4981 tgagggatgc ccgctgcctg cactggaacc cacagacctc acacctgggg gacagaggca
5041 gataagaagg tgcgaggcca cagccctggg agggggtcct gactcacact tactgcaaag
5101 gcttggctcc cgaatgtcgc atttggtgga cgagaaggtg cttccgctgc ttgaaggttt
5161 gtccacattc gtcacagata tagttccgca cctctgagag gggagagtcc agtgagtcca
5221 ggcccctgat gctccaacct cccgggggga cgacgatgac aatgtgaaac catcacagct
5281 gggaagacat ttctgcacat ggttcaccat gcagtgggcc caagcaaggg gcctatgagg
5341 gcctcgttta ttaagatctt taaactgctt tatacactgt cacgtggctt catcagctgt
5401 gtgcatttca ggatggtttt taaagaaacc tcagaaagct atttccttaa aaaaaaaaaa
5461 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa

// 
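The FANCA record above declares its own consistency figures: a BASE COUNT line, a LOCUS length of 5503 bp, and a CDS span of 32..4399. The following is a minimal sketch, not part of the original record, of how those figures can be cross-checked in Python. The function names (`parse_origin`, `base_counts`, `cds_codons`) are hypothetical helpers introduced only for illustration, and the parser assumes that every letter on an ORIGIN line is a base, so a base misread as a digit would simply be dropped.

```python
import re
from collections import Counter


def parse_origin(lines):
    """Concatenate the bases of ORIGIN lines, discarding coordinates and spaces."""
    return "".join(re.sub(r"[^a-z]", "", line.lower()) for line in lines)


def base_counts(sequence):
    """Tally a, c, g, t and lump anything else (damaged or ambiguous symbols) under 'other'."""
    tally = Counter(sequence)
    counts = {base: tally.get(base, 0) for base in "acgt"}
    counts["other"] = len(sequence) - sum(counts.values())
    return counts


def cds_codons(start, end):
    """Return (whole codons, leftover bases) for a 1-based inclusive CDS span."""
    span = end - start + 1
    return span // 3, span % 3


# Values declared in the FANCA record above.
declared = {"a": 1208, "c": 1527, "g": 1492, "t": 1276}
print(sum(declared.values()))   # 5503, the length given on the LOCUS line
print(cds_codons(32, 4399))     # (1456, 0): 1455 residues plus the stop codon
```

Running `base_counts(parse_origin(...))` over the ORIGIN lines and comparing the result with `declared` gives a quick indication of whether the printed sequence survived extraction intact; a nonzero `other` count or a length different from 5503 would point to residual damage rather than to an error in the record itself.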
LOCUS       NM_030588               1378 bp    mRNA    linear   PRI 02-APR-2001
DEFINITION  Homo sapiens DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 9 (RNA
            helicase A, nuclear DNA helicase II; leukophysin) (DDX9),
            transcript variant 2, mRNA.
ACCESSION   NM_030588
VERSION     NM_030588.1  GI:13514821
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1378)
  AUTHORS   Lee,C.G. and Hurwitz,J.
  TITLE     A new RNA helicase isolated from HeLa cells that catalytically
            translocates in the 3' to 5' direction
  JOURNAL   J. Biol. Chem. 267 (7), 4398-4407 (1992)
  MEDLINE   92165790
   PUBMED   1537828
REFERENCE   2  (bases 1 to 1378)
  AUTHORS   Lee,C.G., Zamore,P.D., Green,M.R. and Hurwitz,J.
  TITLE     RNA annealing activity is intrinsically associated with U2AF
  JOURNAL   J. Biol. Chem. 268 (18), 13472-13478 (1993)
  MEDLINE   93293869
   PUBMED   7685763
REFERENCE   3  (bases 1 to 1378)
  AUTHORS   Lee,C.G. and Hurwitz,J.
  TITLE     Human RNA helicase A is homologous to the maleless protein of
            Drosophila
  JOURNAL   J. Biol. Chem. 268 (22), 16822-16830 (1993)
  MEDLINE   93346440
   PUBMED   8344961
REFERENCE   4  (bases 1 to 1378)
  AUTHORS   Abdelhaleem,M.M., Hameed,S., Klassen,D. and Greenberg,A.H.
  TITLE     Leukophysin: an RNA helicase A-related molecule identified in
            cytotoxic T cell granules and vesicles
  JOURNAL   J. Immunol. 156 (6), 2026-2035 (1996)
  MEDLINE   96310937
   PUBMED   8690889
REFERENCE   5  (bases 1 to 1378)
  AUTHORS   Zhang,S. and Grosse,F.
  TITLE     Domain structure of human nuclear DNA helicase II (RNA helicase A)
  JOURNAL   J. Biol. Chem. 272 (17), 11487-11494 (1997)
  MEDLINE   97269062
   PUBMED   9111062
REFERENCE   6  (bases 1 to 1378)
  AUTHORS   Nakajima,T., Uchida,C., Anderson,S.F., Lee,C.G., Hurwitz,J.,
            Parvin,J.D. and Montminy,M.
  TITLE     RNA helicase A mediates association of CBP with RNA polymerase II
  JOURNAL   Cell 90 (6), 1107-1112 (1997)
  MEDLINE   97462911
   PUBMED   9323138
REFERENCE   7  (bases 1 to 1378)
  AUTHORS   Lee,C.G., da Costa Soares,V., Newberger,C., Manova,K., Lacy,E. and
            Hurwitz,J.
  TITLE     RNA helicase A is essential for normal gastrulation
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 95 (23), 13709-13713 (1998)
  MEDLINE   99030634
   PUBMED   9811865
REFERENCE   8  (bases 1 to 1378)
  AUTHORS   Lee,C.G., Eki,T., Okumura,K., Nogami,M., Soares,Vd., Murakami,Y.,
            Hanaoka,F. and Hurwitz,J.
  TITLE     The human RNA helicase A (DDX9) gene maps to the prostate cancer
            susceptibility locus at chromosome band 1q25 and its pseudogene
            (DDX9P) to 13q22, respectively
  JOURNAL   Somat. Cell Mol. Genet. 25 (1), 33-39 (1999)
  MEDLINE   20381755
   PUBMED   10925702
COMMENT     REVIEWED REFSEQ: This record has been curated by NCBI staff. The
            reference sequence was derived from U03643.1.
            Summary: DEAD box proteins, characterized by the conserved motif
            Asp-Glu-Ala-Asp (DEAD), are putative RNA helicases. They are
            implicated in a number of cellular processes involving alteration
            of RNA secondary structure such as translation initiation, nuclear
            and mitochondrial splicing, and ribosome and spliceosome assembly.
            Based on their distribution patterns, some members of this family
            are believed to be involved in embryogenesis, spermatogenesis, and
            cellular growth and division. This gene includes 2 alternatively
            spliced transcripts, encoding 2 different isoforms. The larger
            isoform is a DEAD box protein with RNA helicase activity. It may
            participate in melting of DNA:RNA hybrids, such as those that occur
            during transcription, and may play a role in X-linked gene
            expression. It contains 2 copies of a double-stranded RNA-binding
            domain, a DEXH core domain and an RGG box. The RNA-binding domains
            and RGG box influence and regulate RNA helicase activity. The
            smaller isoform is a lymphocyte granule protein. It lacks
            RNA-binding domains and DEXH core domain, but contains an RGG box,
            which may render this isoform RNA binding function.
            Transcript Variant: This variant (2) is missing a 104 nt internal
            fragment, in addition to 2722 nt in the 5' UTR, as compared to
            variant 1. It encodes the smaller isoform, which is associated with
            lymphocyte granules.
            COMPLETENESS: complete on the 3' end.
FEATURES             Location/Qualifiers
     source          1..1378
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="1"
                     /map="1q25"
     gene            1..1378
                     /gene="DDX9"
                     /note="LKP; NDHII; RHA"
                     /db_xref="LocusID:1660"
                     /db_xref="MIM:603115"
     variation       35
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1049264"
     variation       51
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1049265"
     variation       52
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1049266"
     CDS             358..1065
                     /gene="DDX9"
                     /note="isoform 2 is encoded by transcript variant 2; RNA
                     helicase A; leukophysin; DEAD/H box-9; nuclear DNA
                     helicase II; ATP-dependent RNA helicase A"
                     /codon_start=1
                     /db_xref="LocusID:1660"
                     /db_xref="MIM:603115"
                     /product="DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 9,
                     isoform 2"
                     /protein_id="NP_085077.1"
                     /db_xref="GI:13514822"
                     /translation="MKYPSPFFVFGEKIRTRAISAKGMTLVTPLQLLLFASKKVQSDG
                     QIVLVDDWIKLQISHEAAACITGLRAAMEALVVEVTKQPAIISQLDPVNERMLNMIRQ
                     ISRPSAAGINLMIGSTRYGDGPRPPKMARYDNGSGYRRGGSSYSGGGYGGGYSSGGYG
                     SGGYGGSANSFRAGYGAGVGGGYRGVSRGGFRGNSGGDYRGPSGGYRGSGGFQRGGGR
                     GAYGTGYFGQGRGGGGY"
     misc_feature    760..1062
                     /note="Arg/Gly/Ser/Tyr-rich domain; Region: RGG box"
     variation       1146
                     /allele="A"
                     /allele="T"
                     /db_xref="dbSNP:661"
     variation       1187
                     /allele="G"
                     /allele="T"
                     /db_xref="dbSNP:865"
     variation       1236
                     /allele="G"
                     /allele="T"
                     /db_xref="dbSNP:860"
     variation       1240
                     /allele="A"
                     /allele="T"
                     /db_xref="dbSNP:864"
     variation       1293
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:863"
     variation       1297
                     /allele="A"
                     /allele="T"
                     /db_xref="dbSNP:866"
     variation       1318
                     /allele="A"
                     /allele="T"
                     /db_xref="dbSNP:862"
     polyA_signal    1362..1367
     polyA_site      1378
BASE COUNT 369 a 261 c 351 g 397 t
ORIGIN
FIG. 8 (4/4)

1 cattgctgct gctacctgct ttccagagcc tttcatcaat gaaggaaagc ggctgggcta
61 tatccatcga aattttgctg gaaacagatt ttctgatcac gtagcccttt tatcagtatt
121 ccaagcctgg gatgatgcta gaatgggtgg agaagaagca gagatacgtt tttgtgagca
181 caaaagactt aatatggcta cactaagaat gacctgggaa gccaaagttc agctcaaaga
241 gattttgatt aattctgggt ttccagaaga ttgtttgttg acacaagtgt ttactaacac
301 tggaccagat aataatttgg atgttgttat ctccctcctg gcctttgtag ccaagacatg
361 aagtacccat ctcccttctt tgtatttggt gaaaagattc gaactcgagc catctctgct
421 aaaggcatga ctttagtcac ccccctgcag ttgcttctct ttgcctccaa gaaagtccaa
481 tctgatgggc agattgtgct tgtagatgac tggattaaac tgcaaatatc tcatgaagct
541 gctgcctgta tcactggtct ccgggcagcc atggaggctt tggttgttga agtaaccaaa
601 caacctgcta tcatcagcca gttggacccc gtaaatgaac gtatgctgaa catgatccgt
661 cagatctcta gaccctcagc tgctggtatc aaccttatga ttggcagtac acggtatgga
721 gatggtccac gtcctcccaa gatggcccga tacgacaatg gaagcggata tagaagggga
781 ggttctagtt acagtggtgg aggctatggc ggtggctata gcagtggagg ctatggtagc
841 ggaggctatg gtggcagcgc caactccttt cgggcaggat atggtgcagg tgttggtgga
901 ggctatagag gagtttcccg aggtggcttt agaggcaact ctggaggaga ctacagaggg
961 cctagtggag gctacagagg atctggggga ttccagcgag gaggtggtag gggggcctat
1021 ggaactggct actttggaca gggaagagga ggtggcggct attaaaactt ggttatgtca
1081 gttcctgtgt gtagacagta aggaaaaaaa ggcatgctat gtgttacgtg ttttttccag
1141 tatgtttatt tgccaccaaa aagtaaatgc attttcaccc attctgtggt tcattgtagt
1201 ttaaggaaac caagcatata gatgcattag tgattttgtt tatattatSjt aaaatataac
1261 gatctcttaa aaataccaca gtttgtattt tttctttaag gagtaaagat ttgcctttaa
1321 ataacttggt attttcctgg ctttcgttta atacaataga aaataaagta ttacaccg
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JOURNAL 
MEDLINE 
REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 

PUBMED 
REFERENCE 
AUTHORS 
TITLE 
JOURNAL 
MEDLINE 

PUBMED 



NM_000875 4989 bp mRNA linear 

Homo sapiens insulin- like growth factor 1 receptor 
NM_000875 

NM 000875.2 GI: 11068002 



PRI 01-NOV-2000 
(IGF1R) , mRNA. 



Craniata; Vertebrata; Euteleostomi; 
Catarrhini ; Hominidae ; Homo . 



IGFI-R 

human . 

Homo sapiens 

Eukaryota; Metazoa; Chordata; 
Mammalia; Eutheria; Primates ; 

1 (bases 1 to 4989) 

Flier JS, Usher P and Moses AC. 

Monoclonal antibody to the type I insulin- like growth factor 
(IGF-I) receptor blocks IGF-I receptor-mediated DNA synthesis: 
clarification of the mitogenic mechanisms of IGF-I and insulin in 
human skin fibroblasts 

Proc. Natl. Acad. Sci. U.S.A. 83 (3), 664-668 (1986) 

86121000 

3003744 

2 (bases 1 to 4989) 

Francke U, Yang-Feng TL, Brissenden JE and Ullrich A. 
Chromosomal mapping of genes involved in growth control 
Cold Spring Harb. Symp. Quant. Biol. 51 Pt 2, 855-866 (1986) 
87217109 
3107886 

3 (bases 1 to 4989) 

Ullrich,A., Gray,A., Tam,A.W., Yang-Feng,T., Tsubokawa,M., 
Collins,C., Henzel,W., Bon,T.L., Kathuria,S., Chen,E., Jakobs,S., 
Francke,U., Ramachandran,J. and Fujita-Yamaguchi,Y. 

Insulin-like growth factor I receptor primary structure: comparison 

with insulin receptor suggests structural determinants that define 

functional specificity 

EMBO J. 5 (10), 2503-2512 (1986) 

87053815 

4 (bases 1 to 4989) 

Cooke DW, Bankert LA, Roberts CT Jr, LeRoith D and Casella SJ. 
Analysis of the human type I insulin-like growth factor receptor 
promoter region 

Biochem. Biophys. Res. Commun. 177 (3), 1113-1120 (1991) 

91282751 

1711844 

5 (bases 1 to 4989) 

Abbott AM, Bueno R, Pedrini MT, Murray JM and Smith RJ. 

Insulin-like growth factor I receptor gene structure 

J. Biol. Chem. 267 (15), 10759-10763 (1992) 

92268129 

1316909 
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REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
PUBMED 
COMMENT 



FEATURES 

source 



gene 



6 (bases 1 to 4989) 

Werner H, Karnieli E, Rauscher FJ and LeRoith D. 

Wild-type and mutant p53 differentially regulate transcription of 

the insulin-like growth factor I receptor gene 

Proc. Natl. Acad. Sci. U.S.A. 93 (16), 8318-8323 (1996) 

96323219 

8710868 

7 (bases 1 to 4989) 

Grant ES, Ross MB, Ballard S, Naylor A and Habib FK. 

The insulin-like growth factor type I receptor stimulates growth 

and suppresses apoptosis in prostatic stromal cells 

J. Clin. Endocrinol. Metab. 83 (9), 3252-3257 (1998) 

98417960 

9745438 

REVIEWED REFSEQ: This record has been curated by NCBI staff. The 
reference sequence was derived from X04434.1, M69229.1. 
On Nov 1, 2000 this sequence version replaced gi:4557664. 
Summary: This receptor binds insulin-like growth factor with a high 
affinity. It has tyrosine kinase activity. The insulin-like growth 
factor I receptor plays a critical role in transformation events. 
Cleavage of the precursor generates alpha and beta subunits. It is 
highly overexpressed in most malignant tissues where it functions 
as an anti-apoptotic agent by enhancing cell survival. 

Location/Qualifiers 

1..4989 
/organism="Homo sapiens" 
/db_xref="taxon:9606" 
/chromosome="15" 
/map="15q25-q26" 
/clone="(lambda)IGF-1-R.85, (lambda)IGF-1-R.76" 
/tissue_type="placenta" 
/clone_lib="(lamda)gt10" 

1..4989 
/gene="IGF1R" 
/note="JTK13" 
/db_xref="LocusID:3480" 
/db_xref="MIM:147370" 

46..4149 
/gene="IGF1R" 
/EC_number="2.7.1.112" 
/codon_start=1 
/db_xref="LocusID:3480" 
/db_xref="MIM:147370" 
/product="insulin-like growth factor 1 receptor precursor" 
/protein_id="NP_000866.1" 
/db_xref="GI:4557665" 

/translation="MKSGSGGGSPTSLWGLLFLSAALSLWPTSGEICGPGIDIRNDYQ 
QLKRLENCTVIEGYLHILLISKAEDYRSYRFPKLTVITEYLLLFRVAGLESLGDLFPN 
LTVIRGWKLFYNYALVIFEMTNLKDIGLYNLRNITRGAIRIEKNADLCYLSTVDWSLI 
LDAVSNNYIVGNKPPKECGDLCPGTMEEKPMCEKTTINNEYNYRCWTTNRCQKMCPST 
CGKRACTENNECCHPECLGSCSAPDNDTACVACRHYYYAGVCVPACPPNTYRFEGWRC 
VDRDFCANILSAESSDSEGFVIHDGECMQECPSGFIRNGSQSMYCIPCEGPCPKVCEE 
EKKTKTIDSVTSAQMLQGCTIFKGNLLINIRRGNNIASELENFMGLIEVVTGYVKIRH 
SHALVSLSFLKNLRLILGEEQLEGNYSFYVLDNQNLQQLWDWDHRNLTIKAGKMYFAF 
NPKLCVSEIYRMEEVTGTKGRQSKGDINTRNNGERASCESDVLHFTSTTTSKNRIIIT 
WHRYRPPDYRDLISFTVYYKEAPFKNVTEYDGQDACGSNSWNMVDVDLPPNKDVEPGI 
LLHGLKPWTQYAVYVKAVTLTMVENDHIRGAKSEILYIRTNASVPSIPLDVLSASNSS 
SQLIVKWNPPSLPNGNLSYYIVRWQRQPQDGYLYRHNYCSKDKIPIRKYADGTIDIEE 
VTENPKTEVCGGEKGPCCACPKTEAEKQAEKEEAEYRKVFENFLHNSIFVPRPERKRR 
DVMQVANTTMSSRSRNTTAADTYNITDPEELETEYPFFESRVDNKERTVISNLRPFTL 
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misc 


feature 


variation 


misc 


feature 


misc 
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misc 
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misc 
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misc 
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misc 
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misc 


feature 


misc 


feature 



YRIDIHSCNHEAEKLGCSASNFVFARTMPAEGADDIPGPVTWEPRPENSIFLKWPEPE 
NPNGLILMYEIKYGSQVEDQRECVSRQEYRKYGGAKLNRLNPGNYTARIQATSLSGNG 
SWTDPVFFYVQAKTGYENFIHLIIALPVAVLLIVGGLVIMLYVFHRKRNNSRLGNGVL 
YASVNPEYFSAADVYVPDEWEVAREKITMSRELGQGSFGMVYEGVAKGVVKDEPETRV 
AIKTVNEAASMRERIEFLNEASVMKEFNCHHVVRLLGVVSQGQPTLVIMELMTRGDLK 
SYLRSLRPEMENNPVLAPPSLSKMIQMAGEIADGMAYLNANKFVHRDLAARNCMVAED 
FTVKIGDFGMTRDIYETDYYRKGGKGLLPVRWMSPESLKDGVFTTYSDVWSFGVVLWE 
IATLAEQPYQGLSNEQVLRFVMEGGLLDKPDNCPDMLFELMRMCWQYNPKMRPSFLEI 
ISSIKEEMEPGFREVSFYYSEENKLPEPEELDLEPENMESVPLDPSASSSSLPLPDRH 
SGHKAENGPGPGVLVLRASFDERQPYAHMNGGRKNERALPLPQSSTC" 

46..123 

121..4134 
/product="IGF-I receptor" 
122..2251 
/note="alpha-subunit (AA 1 - 710)" 
182..190 
/note="pot.N-linked glycosylation site (AA 21 - 23)" 
196..561 
/note="Recep_L_domain; Region: Receptor L domain" 
335..343 
/note="pot.N-linked glycosylation site (AA 72 - 74)" 
434..442 
/note="pot.N-linked glycosylation site (AA 105 - 107)" 
568..1044 
/note="Furin-like; Region: Furin-like cysteine rich 
region" 
724..852 
/note="FU; Region: Furin-like repeats" 
761..769 
/note="pot.N-linked glycosylation site (AA 214 - 216)" 
971..979 
/note="pot.N-linked glycosylation site (AA 284 - 286)" 
1162..1410 
/note="Recep_L_domain; Region: Receptor L domain" 
1280..1288 
/note="pot.N-linked glycosylation site (AA 387 - 389)" 
1343..1351 
/note="pot.N-linked glycosylation site (AA 408 - 410)" 
1631..1639 
/note="pot.N-linked glycosylation site (AA 504 - 506)" 
1731 
/allele="A" 
/allele="G" 
/db_xref="dbSNP:2228531" 
1850..1858 
/note="pot.N-linked glycosylation site (AA 577 - 579)" 
1895..1903 
/note="pot.N-linked glycosylation site (AA 592 - 594)" 
1949..1957 
/note="pot.N-linked glycosylation site (AA 610 - 612)" 
2240..2251 
/note="putative proreceptor processing site (AA 707 - 
710)" 
2252..4132 
/note="beta-subunit (AA 711 - 1337)" 
2270..2278 
/note="pot.N-linked glycosylation site (AA 717 - 719)" 
2297..2305 
/note="pot.N-linked glycosylation site (AA 726 - 728)" 
2321..2329 
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feature 


misc 
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variation 



variation 



/note="pot.N-linked glycosylation site (AA 734 - 736)" 
2548..2796 
/note="fn3; Region: Fibronectin type III domain" 
2729..2737 
/note="pot.N-linked glycosylation site (AA 870 - 872)" 
2768..2776 
/note="pot.N-linked glycosylation site (AA 883 - 885)" 
2836..2910 
/note="transmembrane region (AA 906 - 929); 
transmembrane-region site" 
2918..2926 
/note="pot.N-linked glycosylation site (AA 933 - 935)" 
3040..3834 
/note="pkinase; Region: Eukaryotic protein kinase domain" 
3040..3843 
/note="TyrKc; Region: Tyrosine kinase, catalytic domain" 
3047..3049 
/note="pot.ATP binding site (AA 976)" 
3052..3807 
/note="S_TKc; Region: Serine/Threonine protein kinases, 
catalytic domain" 
3053..3055 
/note="pot.ATP binding site (AA 978)" 
3062..3064 
/note="pot.ATP binding site (AA 981)" 
3128..3130 
/note="pot.ATP binding site (AA 1003)" 
4267 
/allele="A" 
/allele="T" 
/db_xref="dbSNP:1065304" 
4268 
/allele="A" 
/allele="T" 
/db_xref="dbSNP:1065305" 
BASE COUNT 1216 a 1371 c 1320 g 1082 t 
ORIGIN 
1 tttttttttt ttttgagaaa gggaatttca tcccaaataa aaggaatgaa gtctggctcc 
61 ggaggagggt ccccgacctc gctgtggggg ctcctgtttc tctccgccgc gctctcgctc 
121 tggccgacga gtggagaaat ctgcgggcca ggcatcgaca tccgcaacga ctatcagcag 
181 ctgaagcgcc tggagaactg cacggtgatc gagggctacc tccacatcct gctcatctcc 
241 aaggccgagg actaccgcag ctaccgcttc cccaagctca cggtcattac cgagtacttg 
301 ctgctgttcc gagtggctgg cctcgagagc ctcggagacc tcttccccaa cctcacggtc 
361 atccgcggct ggaaactctt ctacaactac gccctggtca tcttcgagat gaccaatctc 
421 aaggatattg ggctttacaa cctgaggaac attactcggg gggccatcag gattgagaaa 
481 aatgctgacc tctgttacct ctccactgtg gactggtccc tgatcctgga tgcggtgtcc 
541 aataactaca ttgtggggaa taagccccca aaggaatgtg gggacctgtg tccagggacc 
601 atggaggaga agccgatgtg tgagaagacc accatcaaca atgagtacaa ctaccgctgc 
661 tggaccacaa accgctgcca gaaaatgtgc ccaagcacgt gtgggaagcg ggcgtgcacc 
721 gagaacaatg agtgctgcca ccccgagtgc ctgggcagct gcagcgcgcc tgacaacgac 
781 acggcctgtg tagcttgccg ccactactac tatgccggtg tctgtgtgcc tgcctgcccg 
841 cccaacacct acaggtttga gggctggcgc tgtgtggacc gtgacttctg cgccaacatc 
901 ctcagcgccg agagcagcga ctccgagggg tttgtgatcc acgacggcga gtgcatgcag 
961 gagtgcccct cgggcttcat ccgcaacggc agccagagca tgtactgcat cccttgtgaa 
1021 ggtccttgcc cgaaggtctg tgaggaagaa aagaaaacaa agaccattga ttctgttact 
1081 tctgctcaga tgctccaagg atgcaccatc ttcaagggca atttgctcat taacatccga 
1141 cgggggaata acattgcttc agagctggag aacttcatgg ggctcatcga ggtggtgacg 
1201 ggctacgtga agatccgcca ttctcatgcc ttggtctcct tgtccttcct aaaaaacctt 
1261 cgcctcatcc taggagagga gcagctagaa gggaattact ccttctacgt cctcgacaac 
1321 cagaacttgc agcaactgtg ggactgggac caccgcaacc tgaccatcaa agcagggaaa 
1381 atgtactttg ctttcaatcc caaattatgt gtttccgaaa tttaccgcat ggaggaagtg 
1441 acggggacta aagggcgcca aagcaaaggg gacataaaca ccaggaacaa cggggagaga 
1501 gcctcctgtg aaagtgacgt cctgcatttc acctccacca ccacgtcgaa gaatcgcatc 
1561 atcataacct ggcaccggta ccggccccct gactacaggg atctcatcag cttcaccgtt 
1621 tactacaagg aagcaccctt taagaatgtc acagagtatg atgggcagga tgcctgcggc 
1681 tccaacagct ggaacatggt ggacgtggac ctcccgccca acaaggacgt ggagcccggc 
1741 atcttactac atgggctgaa gccctggact cagtacgccg tttacgtcaa ggctgtgacc 
1801 ctcaccatgg tggagaacga ccatatccgt ggggccaaga gtgagatctt gtacattcgc 
1861 accaatgctt cagttccttc cattcccttg gacgttcttt cagcatcgaa ctcctcttct 
1921 cagttaatcg tgaagtggaa ccctccctct ctgcccaacg gcaacctgag ttactacatt 
1981 gtgcgctggc agcggcagcc tcaggacggc tacctttacc ggcacaatta ctgctccaaa 
2041 gacaaaatcc ccatcaggaa gtatgccgac ggcaccatcg acattgagga ggtcacagag 
2101 aaccccaaga ctgaggtgtg tggtggggag aaagggcctt gctgcgcctg ccccaaaact 
2161 gaagccgaga agcaggccga gaaggaggag gctgaatacc gcaaagtctt tgagaatttc 
2221 ctgcacaact ccatcttcgt gcccagacct gaaaggaagc ggagagatgt catgcaagtg 
2281 gccaacacca ccatgtccag ccgaagcagg aacaccacgg ccgcagacac ctacaacatc 
2341 accgacccgg aagagctgga gacagagtac cctttctttg agagcagagt ggataacaag 
2401 gagagaactg tcatttctaa ccttcggcct ttcacattgt accgcatcga tatccacagc 
2461 tgcaaccacg aggctgagaa gctgggctgc agcgcctcca acttcgtctt tgcaaggact 
2521 atgcccgcag aaggagcaga tgacattcct gggccagtga cctgggagcc aaggcctgaa 
2581 aactccatct ttttaaagtg gccggaacct gagaatccca atggattgat tctaatgtat 
2641 gaaataaaat acggatcaca agttgaggat cagcgagaat gtgtgtccag acaggaatac 
2701 aggaagtatg gaggggccaa gctaaaccgg ctaaacccgg ggaactacac agcccggatt 
2761 caggccacat ctctctctgg gaatgggtcg tggacagatc ctgtgttctt ctatgtccag 
2821 gccaaaacag gatatgaaaa cttcatccat ctgatcatcg ctctgcccgt cgctgtcctg 
2881 ttgatcgtgg gagggttggt gattatgctg tacgtcttcc atagaaagag aaataacagc 
2941 aggctgggga atggagtgct gtatgcctct gtgaacccgg agtacttcag cgctgctgat 
3001 gtgtacgttc ctgatgagtg ggaggtggct cgggagaaga tcaccatgag ccgggaactt 
3061 gggcaggggt cgtttgggat ggtctatgaa ggagttgcca agggtgtggt gaaagatgaa 
3121 cctgaaacca gagtggccat taaaacagtg aacgaggccg caagcatgcg tgagaggatt 
3181 gagtttctca acgaagcttc tgtgatgaag gagttcaatt gtcaccatgt ggtgcgattg 
3241 ctgggtgtgg tgtcccaagg ccagccaaca ctggtcatca tggaactgat gacacggggc 
3301 gatctcaaaa gttatctccg gtctctgagg ccagaaatgg agaataatcc agtcctagca 
3361 cctccaagcc tgagcaagat gattcagatg gccggagaga ttgcagacgg catggcatac 
3421 ctcaacgcca ataagttcgt ccacagagac cttgctgccc ggaattgcat ggtagccgaa 
3481 gatttcacag tcaaaatcgg agattttggt atgacgcgag atatctatga gacagactat 
3541 taccggaaag gaggcaaagg gctgctgccc gtgcgctgga tgtctcctga gtccctcaag 
3601 gatggagtct tcaccactta ctcggacgtc tggtccttcg gggtcgtcct ctgggagatc 
3661 gccacactgg ccgagcagcc ctaccagggc ttgtccaacg agcaagtcct tcgcttcgtc 
3721 atggagggcg gccttctgga caagccagac aactgtcctg acatgctgtt tgaactgatg 
3781 cgcatgtgct ggcagtataa ccccaagatg aggccttcct tcctggagat catcagcagc 
3841 atcaaagagg agatggagcc tggcttccgg gaggtctcct tctactacag cgaggagaac 
3901 aagctgcccg agccggagga gctggacctg gagccagaga acatggagag cgtccccctg 
3961 gacccctcgg cctcctcgtc ctccctgcca ctgcccgaca gacactcagg acacaaggcc 
4021 gagaacggcc ccggccctgg ggtgctggtc ctccgcgcca gcttcgacga gagacagcct 
4081 tacgcccaca tgaacggggg ccgcaagaac gagcgggcct tgccgctgcc ccagtcttcg 
4141 acctgctgat ccttggatcc tgaatctgtg caaacagtaa cgtgtgcgca cgcgcagcgg 
4201 ggtggggggg gagagagagt tttaacaatc cattcacaag cctcctgtac ctcagtggat 
4261 cttcagttct gcccttgctg cccgcgggag acagcttctc tgcagtaaaa cacatttggg 
4321 atgttccttt tttcaatatg caagcagctt tttattccct gcccaaaccc ttaactgaca 
4381 tgggccttta agaaccttaa tgacaacact taatagcaac agagcacttg agaaccagtc 
4441 tcctcactct gtccctgtcc ttccctgttc tccctttctc tctcctctct gcttcataac 
4501 ggaaaaataa ttgccacaag tccagctggg aagccctttt tatcagtttg aggaagtggc 
4561 tgtccctgtg gccccatcca accactgtac acacccgcct gacaccgtgg gtcattacaa 
4621 aaaaacacgt ggagatggaa atttttacct ttatctttca cctttctagg gacatgaaat 
4681 ttacaaaggg ccatcgttca tccaaggctg ttaccatttt aacgctgcct aattttgcca 
4741 aaatcctgaa ctttctccct catcggcccg gcgctgattc ctcgtgtccg gaggcatggg 
4801 tgagcatggc agctggttgc tccatttgag agacacgctg gcgacacact ccgtccatcc 
4861 gactgcccct gctgtgctgc tcaaggccac aggcacacag gtctcattgc ttctgactag 
4921 attattattt gggggaactg gacacaatag gtctttctct cagtgaaggt ggggagaagc 
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LOCUS 

DEFINITION 

ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 



TITLE 



JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 

TITLE 



Sanchez-Pulido,L., Lozano,J.J., Paciucci, 
Harvey,C., Bercovich,B., Loukili,N., 
S.L., Sanz,F., Estivill,X., Valencia,A. 



NM_003349 2394 bp mRNA linear PRI 21-SEP-2001 

Homo sapiens ubiquitin-conjugating enzyme E2 variant 1 (UBE2V1), 
transcript variant 2, mRNA. 
NM_003349 

NM_003349.3 GI:15718757 UBE2V1 

human. 

Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

1 (bases 1 to 2394) 
Rothofsky,M.L. and Lin,S.L. 

CROC-1 encodes a protein which mediates transcriptional activation 

of the human FOS promoter 

Gene 195 (2) , 141-149 (1997) 

97449289 

9305758 

2 (bases 1 to 2394) 
Sancho, E. , Vila, M. R. 
R., Nadal, M., Fox, M. 
Ciechanover, A., Lin, 
and Thomson, T.M. 
Role of UEV-1, an inactive variant of the E2 ubiquitin-conjugating 
enzymes, in in vitro differentiation and cell cycle behavior of 
HT-29-M6 intestinal mucosecretory cells 

Mol. Cell. Biol. 18 (1), 576-589 (1998) 

98078713 

9418904 

3 (bases 1 to 2394) 

Ma,L., Broomfield,S., Lavery,C., Lin,S.L., Xiao,W. and Bacchetti,S. 
Up-regulation of CIR1/CROC1 expression upon cell immortalization 
and in tumor-derived human cell lines 
Oncogene 17 (10), 1321-1326 (1998) 

98442973 
9771976 

4 (bases 1 to 2394) 
Hofmann,R.M. and Pickart,C.M. 

Noncanonical MMS2-encoded ubiquitin-conjugating enzyme functions in 

assembly of novel polyubiquitin chains for DNA repair 

Cell 96 (5), 645-653 (1999) 

99189750 

10089880 

5 (bases 1 to 2394) 

Deng,L., Wang,C., Spencer,E., Yang,L., Braun,A., You,J., 
Slaughter,C., Pickart,C. and Chen,Z.J. 

Activation of the IkappaB kinase complex by TRAF6 requires a 
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REFERENCE 
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TITLE 

JOURNAL 
MEDLINE 
PUBMED 
COMMENT 



FEATURES 

source 



gene 



CDS 



FIG. 10(2/4) 



10/510905 



misc feature 



dimeric ubiquitin-conjugating enzyme complex and a unique 

polyubiquitin chain 

Cell 103 (2), 351-361 (2000) 

20509589 

11057907 

6 (bases 1 to 2394) 

Thomson,T.M., Lozano,J.J., Loukili,N., Carrio,R., Serras,F., 
Cormand,B., Valeri,M., Diaz,V.M., Abril,J., Burset,M., Merino,J., 
Macaya,A., Corominas,M. and Guigo,R. 

Fusion of the human gene for the polyubiquitination coeffector UEV1 

with Kua, a newly identified gene 

Genome Res. 10 (11), 1743-1756 (2000) 

20530912 

11076860 

REVIEWED REFSEQ: This record has been curated by NCBI staff. The 
reference sequence was derived from U39361.1, AL110132.1. 
On Sep 21, 2001 this sequence version replaced gi:12025659. 
Summary: Ubiquitin-conjugating enzyme E2 variant proteins 
constitute a distinct subfamily within the E2 protein family. They 
have sequence similarity to other ubiquitin-conjugating enzymes but 
lack the conserved cysteine residue that is critical for the 
catalytic activity of E2s. The protein encoded by this gene is 
located in the nucleus and can cause transcriptional activation of 
the human FOS proto-oncogene. It is thought to be involved in the 
control of differentiation by altering cell cycle behaviour. 
Multiple alternatively spliced transcripts encoding different 
isoforms have been described for this gene. 

Transcript Variant: This variant (2) encodes the longest isoform 
(b) of this protein. 

COMPLETENESS: complete on the 3' end. 
Location/Qualifiers 
1..2394 
/organism="Homo sapiens" 
/db_xref="taxon:9606" 
/chromosome="20" 
/map="20q13.2" 

1..2394 
/gene="UBE2V1" 
/note="CIR1; UEV-1; UEV1; UEV1A; CROC-1; CROC1" 
/db_xref="LocusID:7335" 
/db_xref="MIM:602995" 

70..735 
/gene="UBE2V1" 
/note="isoform b is encoded by transcript variant 2; 
DNA-binding protein" 
/codon_start=1 
/db_xref="LocusID:7335" 
/db_xref="MIM:602995" 
/product="ubiquitin-conjugating enzyme E2 variant 1, 
isoform b" 
/protein_id="NP_003340.1" 
/db_xref="GI:4507795" 
/translation="MAYKFRTHSPEALEQLYPWECFVFCLIIFGTFTNQIHKWSHTYF 
GLPRWVTLLQDWHVILPRKHHRIHHVSPHETYFCITTGVKVPRNFRLLEELEEGQKGV 
GDGTVSWGLEDDEDMTLTRWTGMIIGPPRTIYENRIYSLKIECGPKYPEAPPFVRFVT 
KINMNGVNSSNGVVDPRAISVLAKWQNSYSIKVVLQELRRLMMSKENMKLPQPPEGQC 
YSN" 

334..714 
/note="UBCc; Region: Ubiquitin-conjugating enzyme E2, 
catalytic domain homologues" 
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10/5109 



misc feature 



misc feature 



variation 



variation 



variation 



variation 



polyA signal 
polyA site 

variation 



variation 



variation 



polyA signal 



337..555 
/note="UQ_con; Region: Ubiquitin-conjugating enzyme. 
Proteins destined for proteasome-mediated degradation may 
be ubiquitinated. Ubiquitination follows conjugation of 
ubiquitin to a conserved cysteine residue of UBC 
homologues. TSG101 is one of several UBC homologues that 
lacks this active site cysteine" 
643..714 
/note="Region: DNA-binding domain" 
1117 
/allele="C" 
/allele="T" 
/db_xref="dbSNP:6585" 
1257 
/allele="C" 
/allele="T" 
/db_xref="dbSNP:1049679" 
complement(1968) 
/allele="A" 
/allele="G" 
/db_xref="dbSNP:2733" 
2017 
/allele="A" 
/allele="C" 
/db_xref="dbSNP:15218" 
2112..2117 
2135 
/evidence=experimental 
complement(2179) 
/allele="G" 
/allele="T" 
/db_xref="dbSNP:2664563" 
2249 
/note="WARNING: map location ambiguous" 
/allele="A" 
/allele="T" 
/db_xref="dbSNP:1049871" 
complement(2259) 
/allele="A" 
/allele="G" 
/db_xref="dbSNP:2664532" 
2350..2355 
/evidence=experimental 
2373 

polyA site 
BASE COUNT 658 a 605 c 481 g 650 t 
ORIGIN 

1 ttcacacggc acgacttcat cgagaccaac ggggacaact gcctggtgac actgctgccg 

61 ctgctaaaca tggcctacaa gttccgcacc cacagccctg aagccctgga gcagctatac 

121 ccctgggagt gcttcgtctt ctgcctgatc atcttcggca ccttcaccaa ccagatccac 

181 aagtggtcgc acacgtactt tgggctgcca cgctgggtca ccctcctgca ggactggcat 

241 gtcatcctgc cacgtaaaca ccatcgcatc caccacgtct caccccacga gacctacttc 

301 tgcatcacca caggagtaaa agtccctcgc aatttccgac tgttggaaga actcgaagaa 

361 ggccagaaag gagtaggaga tggcacagtt agctggggtc tagaagatga cgaagacatg 
421 acacttacaa gatggacagg gatgataatt gggcctccaa gaacaattta tgaaaaccga 

481 atatacagcc ttaaaataga atgtggacct aaatacccag aagcaccccc ctttgtaaga 
541 tttgtaacaa aaattaatat gaatggagta aatagttcta atggagtggt ggacccaaga 
601 gccatatcag tgctagcaaa atggcagaat tcatatagca tcaaagttgt cctgcaagag 
661 cttcggcgcc taatgatgtc taaagaaaat atgaaactcc ctcagccgcc cgaaggacag 
721 tgttacagca attaatcaaa aagaaaaacc acaggccctt ccccttcccc ccaattcgat 
781 ttaatcagtc ttcattttcc acagtagtaa attttctaga tacgtcttgt agacctcaaa 
841 gtaccggaaa ggaagctccc attcaaagga aatttatctt aagatactgt aaatgatact 
901 aattttttgt ccatttgaaa tatataagtt gtgctataac aaatcatcct gtcaagtgta 
961 accactgtcc acgtagttga acttctggga tcaagaaagt ctatttaaat tgattcccat 
1021 cataactggt ggggcacatc taactcaact gtgaaaagac acatcacaca atcaccttgc 
1081 tgctgattac acggcctggg gtctctgcct tctcccttta ccctcccgcc tcccaccctc 
1141 cctgcaacaa cagccctcta gcctgggggg cttgttagag tagatgtgaa ggtttcaggt 
1201 cgcagcctgt gggactactg ctaggtgtgt ggggtgtttc gcctgcaccc ctggttcctt 
1261 taagtcttaa gtgatgcccc ttccaaacca tcatcctgtc cccacgctcc tccactcccg 
1321 cccttggccg aagcatagat tgtaacccct ccactcccct ctgagattgg cttcggtgag 
1381 gaattcaggg ctttccccat atcttctctc cccccacctt tatcgagggg tgctgctttt 
1441 tctccctcct cctcaagttc ctttttgcac cgtcaccacc caacaccttc catgacactt 
1501 ccttgctttg gccagaagcc atcaggtaag gttggaaaga gcctctgacc tcccttgttt 
1561 agttttggaa ccatactcac tcactctcca ccagcctggg aaatgaatat tgggtcctca 
1621 gccctgccac cctctgctgt catcagctga tgcattgttt ttagctcagg ttttgataag 
1681 gtgaaaagaa tagtcaccag ggttactcag acctgccagc tctcggagtc cttggtggtt 
1741 gaacttggag aaagaccgca tgaagatact tgtaagcaca catgatccct ctgaattgtt 
1801 ttactttcct gtaactgctt ttgcttttaa aaattgaaga agtbttaaac agggctttca 
1861 tttggtcatc cttgcaatcc attggggtct agtttggaat ctgacaactg gaacaaaaag 
1921 aaccttgaat ccggtgcatg ccttggtttt ggtgctgctg ctgcttccca agatcctcag 
1981 cagggattaa gaaggaaccc ggtgtgcaca gcagatcccc gaaattggtg ggcttgacct 
2041 cctggcaaat tgctgcgtct ttccacttgc tgttcaggac cactaaat^c tgaaatgtgg 
2101 atgcataccg aaataaaagc aattcattgt gtactaaagg tttttttttt ttttttaatt 
2161 tagtatttgt gtaaaaccac cttttgaagc agcaactatc aagtctgaaa agcaattgat 
2221 gtttccatta atctttttct ggggggaaaa ccttagttct aaggatttaa catcctgtaa 
2281 gtgaagttta acataacagt attccataag cagccttttt attgtcagac cattgcctga 
2341 ttttaatata ataaaaaaaa agtgtgcgtt aataaaaaaa aaaaaaaaaa aaaa 
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LOCUS 

DEFINITION 

ACCESSION 

VERSION 

KEYWORDS 

SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 
JOURNAL 
MEDLINE 
PUBMED 

REFERENCE 
AUTHORS 

TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 

TITLE 
JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
REFERENCE 
AUTHORS 
TITLE 



NM_000689 1506 bp mRNA linear PRI 31-OCT-2000 

Homo sapiens aldehyde dehydrogenase 1, soluble (ALDH1), mRNA. 
NM_000689 

NM_000689.1 GI:4502030 



ALDEHYDE DEHYDROGENASE 



human. 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

1 (bases 1 to 1506) 

Hsu LC, Tani K, Fujiyoshi T, Kurachi K and Yoshida A. 
Cloning of cDNAs for human aldehyde dehydrogenases 1 and 2 
Proc. Natl. Acad. Sci. U.S.A. 82 (11), 3771-3775 (1985) 
85216574 
2987944 

2 (bases 1 to 1506) 

Raghunathan L, Hsu LC, Klisak I, Sparkes RS, Yoshida A and Mohandas 
T. 

Regional localization of the human genes for aldehyde 

dehydrogenase-1 and aldehyde dehydrogenase-2 

Genomics 2 (3), 267-269 (1988) 

88284707 

3397064 

3 (bases 1 to 1506) 

Hsu LC, Chang WC and Yoshida A. 

Genomic structure of the human cytosolic aldehyde dehydrogenase 
gene 

Genomics 5 (4), 857-865 (1989) 

90077427 

2591967 

4 (bases 1 to 1506) 

Pereira F, Rosenmann E, Nylen E, Kaufman M, Pinsky L and Wrogemann 
K. 

The 56 kDa androgen binding protein is an aldehyde dehydrogenase 
Biochem. Biophys. Res. Commun. 175 (3), 831-838 (1991) 
91222190 
1709013 

5 (bases 1 to 1506) 

Zheng,C.F., Wang,T.T. and Weiner,H. 

Cloning and expression of the full-length cDNAs encoding human 
liver class 1 and class 2 aldehyde dehydrogenase 
Alcohol. Clin. Exp. Res. 17 (4), 828-831 (1993) 
94027752 

6 (bases 1 to 1506) 
Kathmann,E.C. and Lipsky,J.J. 

Cloning of a cDNA encoding a constitutively expressed rat liver 
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FEATURES 

source 



(2), 527-531 (1997) 



gene 



misc feature 



variation 



variation 



variation 



cytosolic aldehyde dehydrogenase 
Biochem. Biophys. Res. Commun. 236 
97382470 

PROVISIONAL REFSEQ: This record has not yet been subject to final 
NCBI review. The reference sequence was derived from AF003341.1. 

Location/Qualifiers 

1..1506 
/organism="Homo sapiens" 
/db_xref="taxon:9606" 
/chromosome="9" 
/map="9q21" 
/tissue_type="liver" 

1..1506 
/gene="ALDH1" 
/note="PUMB1" 
/db_xref="LocusID:216" 
/db_xref="MIM:100640" 

1..1506 
/gene="ALDH1" 
/EC_number="1.2.1.3" 
/note="cytosolic protein; class 1" 
/codon_start=1 
/db_xref="LocusID:216" 
/db_xref="MIM:100640" 
/product="aldehyde dehydrogenase 1, soluble" 
/protein_id="NP_000680.1" 
/db_xref="GI:4502031" 
/translation="MSSSGTPDLPVLLTDLKIQYTKIFINNEWHDSVSGKKFPVFNPA 
TEEELCQVEEGDKEDVDKAVKAARQAFQIGSPWRTMDASERGRLLYKLADLIERDRLL 
LATMESMNGGKLYSNAYLSDLAGCIKTLRYCAGWADKIQGRTIPIDGNFFTYTRHEPI 
GVCGQIIPWNFPLVMLIWKIGPALSCGNTVVVKPAEQTPLTALHVASLIKEAGFPPGV 
VNIVPGYGPTAGAAISSHMDIDKVAFTGSTEVGKLIKEAAGKSNLKRVTLELGGKSPC 
IVLADADLDNAVEFAHHGVFYHQGQCCIAASRIFVEESIYDEFVRRSVERAKKYILGN 
PLTPGVTQGPQIDKEQYDKILDLIESGKKEGAKLECGGGPWGNKGYFVQPTVFSNVTD 
EMRIAKEEIFGPVQQIMKFKSLDDVIKRANNTFYGLSAGVFTKDIDKAITISSALQAG 
TVWVNCYGVVSAQCPFGGFKMSGNGRELGEYGFHEYTEVKTVTVKISQKNS" 
82..1488 
/note="aldedh; Region: Aldehyde dehydrogenase family" 
362 
/allele="A" 
/allele="G" 
/db_xref="dbSNP:1049981" 
1337 
/allele="A" 
/allele="C" 
/db_xref="dbSNP:1803054" 
1397 
/allele="A" 
/allele="T" 
/db_xref="dbSNP:1063447" 

BASE COUNT 441 a 293 c 391 g 381 t 
ORIGIN 
1 atgtcatcct caggcacgcc agacttacct gtcctactca ccgatttgaa gattcaatat 
61 actaagatct tcataaacaa tgaatggcat gattcagtga gtggcaagaa atttcctgtc 
121 tttaatcctg caactgagga ggagctctgc caggtagaag aaggagataa ggaggatgtt 
181 gacaaggcag tgaaggccgc aagacaggct tttcagattg gatctccgtg gcgtactatg 
241 gatgcttccg agagggggcg actattatac aagttggctg atttaatcga aagagatcgt 
301 ctgctgctgg cgacaatgga gtcaatgaat ggtggaaaac tctattccaa tgcatatctg 
361 agtgatttag caggctgcat caaaacattg cgctactgtg caggttgggc tgacaagatc 
421 cagggccgta caataccaat tgatggaaat ttttttacat atacaagaca tgaacctatt 
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481 ggtgtatgtg gccaaatcat tccttggaat ttcccgttgg ttatgctcat ttggaagata 
541 gggcctgcac tgagctgtgg aaacacagtg gttgtcaaac cagcagagca aactcctctc 
601 actgctctcc acgtggcatc tttaataaaa gaggcagggt ttcctcctgg agtagtgaat 
661 attgttcctg gttatgggcc tacagcaggg gcagccattt cttctcacat ggatatagac 
721 aaagtagcct tcacaggatc aacagaggtt ggcaagttga tcaaagaagc tgccgggaaa 
781 agcaatctga agagggtgac cctggagctt ggaggaaaga gcccttgcat tgtgttagct 
841 gatgccgact tggacaatgc tgttgaattt gcacaccatg gggtattcta ccaccagggc 
901 cagtgttgta tagccgcatc caggattttt gtggaagaat caatttatga tgagtttgtt 
961 cgaaggagtg ttgagcgggc taagaagtat atccttggaa atcctctgac cccaggagtc 
1021 actcaaggcc ctcagattga caaggaacaa tatgataaaa tacttgacct cattgagagt 
1081 gggaagaaag aaggggccaa actggaatgt ggaggaggcc cgtgggggaa taaaggctac 
1141 tttgtccagc ccacagtgtt ctctaatgtt acagatgaga tgcgcattgc caaagaggag 
1201 atttttggac cagtgcagca aatcatgaag tttaaatctt tagatgacgt gatcaaaaga 
1261 gcaaacaata ctttctatgg cttatcagca ggagtgttta ccaaagacat tgataaagcc 
1321 ataacaatct cctctgctct gcaggcagga acagtgtggg tgaattgcta tggcgtggta 
1381 agtgcccagt gcccctttgg cggattcaag atgtctggaa atggaagaga actgggagag 
1441 tacggtttcc atgaatatac agaggtcaaa acagtcacag tgaaaatctc tcagaagaac 
1501 tcataa 
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XM_037768 2282 bp mRNA linear PRI 07-FEB-2002 

Homo sapiens similar to pyruvate kinase, muscle (H. sapiens) 
(LOC145710) , mRNA. 
XM_037768 

XM_037768.1 GI:14750404 PYRUVATE KINASE 

human. 

Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
1 (bases 1 to 2282) 
NCBI Annotation Project. 
Direct Submission 

Submitted (06-FEB-2002) National Center for Biotechnology 
Information, NIH, Bethesda, MD 20894, USA 

GENOME ANNOTATION REFSEQ: This model reference sequence was 
predicted from NCBI contig NT_010235 by automated computational 
analysis using gene prediction method: BLAST. Also see: 
Documentation of NCBI's Annotation Process; Evidence Viewer - 
alignments supporting this model. 

Location/Qualifiers 

1..2282 
/organism="Homo sapiens" 
/db_xref="taxon:9606" 
/chromosome="15" 
1..2282 
/gene="LOC145710" 
/note="Located on Accession NT_010235" 
/db_xref="InterimID:145710" 
109..1704 
/gene="LOC145710" 
/note="Located on Accession NT_010235" 
/codon_start=1 
/product="similar to pyruvate kinase, muscle (H. sapiens)" 
/protein_id="XP_037768.1" 
/db_xref="GI:14750405" 
/translation="MSKPHSEAGTAFIQTQQLHAAMADTFLEHMCRLDIDSPPITARN 
TGIICTIGPASRSVETLKEMIKSGMNVARLNFSHGTHEYHAETIKNVRTATESFASDP 
ILYRPVAVALDTKGPEIRTGLIKGSGTAEVELKKGATLKITLDNAYMEKCDENILWLD 
YKNICKVVEVGSKIYVDDGLISLQVKQKGADFLVTEVENGGSLGSKKGVNLPGAAVDL 
PAVSEKDIQDLKFGVEQDVDMVFASFIRKASDVHEVRKVLGEKGKNIKIISKIENHEG 
VRRFDEILEASDGIMVARGDLGIEIPAEKVFLAQKMMIGRCNRAGKPVICATQMLESM 
IKKPRPTRAEGSDVANAVLDGADCIMLSGETAKGDYPLEAVRMQHLIAREAEAAIYHL 
QLFEELRRLAPITSDPTEATAVGAVEASFKCCSGAIIVLTKSGRSAHQVARYRPRAPI 
IAVTRNPQTARQAHLYRGIFPVLCKDPVQEAWAEDVDLRVNFAMNVGKARGFFKKGDV 
VIVLTGWRPGSGFTNTMRVVPVP" 
223..1293 
/note="PK; Region: Pyruvate kinase, barrel domain" 
546 
/allele="C" 
/allele="T" 
/db_xref="dbSNP:10514" 
1333..1695 
/note="PK_C; Region: Pyruvate kinase, alpha/beta domain" 
2168 
/allele="C" 
/allele="T" 
/db_xref="dbSNP:1062430" 
BASE COUNT 499 a 646 c 654 g 483 t 

ORIGIN 

1 ggctgaggca gtggctcctt gcacagcagc tgcacgcgcc gtggctccgg atctcttcgt 
61 ctttgcagcg tagcccgagt cggtcagcgc cggaggacct cagcagccat gtcgaagccc 
121 catagtgaag ccgggactgc cttcattcag acccagcagc tgcacgcagc catggctgac 
181 acattcctgg agcacatgtg ccgcctggac attgattcac cacdcatcac agcccggaac 
241 actggcatca tctgtaccat tggcccagct tcccgatcag tggagacgtt gaaggagatg 
301 attaagtctg gaatgaatgt ggctcgtctg aacttctctc atggaactca tgagtaccat 
361 gcggagacca tcaagaatgt gcgcacagcc acggaaagct ttgcttctga ccccatcctc 
421 taccggcccg ttgctgtggc tctagacact aaaggacctg agatccg£ac tgggctcatc 
481 aagggcagcg gcactgcaga ggtggagctg aagaagggag ccactctcaa aatcacgctg 
541 gataacgcct acatggaaaa gtgtgacgag aacatcctgt ggctggacta caagaacatc 
601 tgcaaggtgg tggaagtggg cagcaagatc tacgtggatg atgggcttat ttctctccag 
661 gtgaagcaga aaggtgccga cttcctggtg acggaggtgg aaaatggtgg ctccttgggc 
721 agcaagaagg gtgtgaacct tcctggggct gctgtggact tgcctgctgt gtcggagaag 
781 gacatccagg atctgaagtt tggggtcgag caggatgttg atatggtgtt tgcgtcattc 
841 atccgcaagg catctgatgt ccatgaagtt aggaaggtcc tgggagagaa gggaaagaac 
901 atcaagatta tcagcaaaat cgagaatcat gagggggttc ggaggtttga tgaaatcctg 
961 gaggccagtg atgggatcat ggtggctcgt ggtgatctag gcattgagat tcctgcagag 
1021 aaggtcttcc ttgctcagaa gatgatgatt ggacggtgca accgagctgg gaagcctgtc
1081 atctgtgcta ctcagatgct ggagagcatg atcaagaagc cccgccccac tcgggctgaa 
1141 ggcagtgatg tggccaatgc agtcctggat ggagccgact gcatcatgct gtctggagaa 
1201 acagccaaag gggactatcc tctggaggct gtgcgcatgc agcacctgat tgcccgtgag 
1261 gcagaggctg ccatctacca cttgcaatta tttgaggaac tccgccgcct ggcgcccatt 
1321 accagcgacc ccacagaagc caccgccgtg ggtgccgtgg aggcctcctt caagtgctgc 
1381 agtggggcca taatcgtcct caccaagtct ggcaggtctg ctcaccaggt ggccagatac 
1441 cgcccacgtg cccccatcat tgctgtgacc cggaatcccc agacagctcg tcaggcccac 
1501 ctgtaccgtg gcatcttccc tgtgctgtgc aaggacccag tccaggaggc ctgggctgag 
1561 gacgtggacc tccgggtgaa ctttgccatg aatgttggca aggcccgagg cttcttcaag 
1621 aagggagatg tggtcattgt gctgaccgga tggcgccctg gctccggctt caccaacacc 
1681 atgcgtgttg ttcctgtgcc gtgatggacc ccagagcccc tcctccagcc cctgtcccac 
1741 ccccttcccc cagcccatcc attaggccag caacgcttgt agaactcact ctgggctgta 
1801 acgtggcact ggtaggttgg gacaccaggg aagaagatca acgcctcact gaaacatggc 
1861 tgtgtttgca gcctgctcta gtgggacagc ccagagcctg gctgcccatc atgtggcccc 
1921 acccaatcaa gggaagaagg aggaatgctg gactggaggc ccctggagcc agatggcaag 
1981 agggtgacag cttcctttcc tgtgtgtact ctgtccagtt cctttagaaa aaatggatgc 
2041 ccagaggact cccaaccctg gcttggggtc aagaaacagc cagcaagagt taggggcctt 
2101 agggcactgg gctgttgttc cattgaagcc gactctggcc ctggccctta cttgcttctc 
2161 tagctctcta ggcctctcca gtttgcacct gtccccaccc tccactcagc tgtcctgcag 
2221 caaacactcc accctccacc ttccattttc ccccactact gcagcacctc caggcctgtt 
2281 gc
LOCUS       XM_049337 2631 bp mRNA linear PRI 07-FEB-2002
DEFINITION  Homo sapiens glucose-6-phosphate dehydrogenase (G6PD), mRNA.
ACCESSION   XM_049337
VERSION     XM_049337.1 GI:14768486
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 2631)
  AUTHORS   NCBI Annotation Project.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-FEB-2002) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
COMMENT     GENOME ANNOTATION REFSEQ: This model reference sequence was
            predicted from NCBI contig NT_025965 by automated computational
            analysis using gene prediction method: BLAST.
            Also see:
            Documentation of NCBI's Annotation Process
            Evidence Viewer - alignments supporting this model.
FEATURES             Location/Qualifiers
     source          1..2631
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="X"
     gene            1..2631
                     /gene="G6PD"
                     /note="G6PD1; Located on Accession NT_025965"
                     /db_xref="LocusID:2539"
                     /db_xref="MIM:305900"
     CDS             475..2022
                     /gene="G6PD"
                     /note="Located on Accession NT_025965"
                     /codon_start=1
                     /product="glucose-6-phosphate dehydrogenase"
                     /protein_id="XP_049337.1"
                     /db_xref="GI:14768487"
                     /translation="MAEQVALSRTQVCGILREELFQGDAFHQSDTHIFIIMGASGDLA
                     KKKIYPTIWWLFRDGLLPENTFIMGYARSRLTVADIRKQSEPFFKATPEEKLKLEDFF
                     ARNSYVAGQYDDAASYQRLNSHMDALHLGSQANRLFYLALPPTVYEAVTKNIHESCMS
                     QIGWNRIIVEKPFGRDLQSSDRLSNHISSLFREDQIYRIDHYLGKEMVQNLMVLRFAN
                     RIFGPIWNRDNIACVILTFKEPFGTEGRGGYFDEFGIIRDVMQNHLLQMLCLVAMEKP
                     ASTNSDDVRDEKVKVLKCISEVQANNVVLGQYVGNPDGEGEATKGYLDDPTVPRGSTT
                     ATFAAVVLYVENERWDGVPFILRCGKALNERKAEVRLQFHDVAGDIFHQQCKRNELVI
                     RVQPNEAVYTKMMTKKPGMFFNPEESELDLTYGNRYKNVKLPDAYERLILDVFCGSQM
                     HFVRSDELREAWRIFTPLLHQIELEKPKPIPYIYGSRGPTEADELMKRVGFQYEGTYK
                     WVNPHKL"
     variation       507
                     /allele=
                     /allele=
                     /db_xref="dbSNP:1050827"
                     553..1104
                     /note="G6PD; Region: Glucose-6-phosphate dehydrogenase,
                     NAD binding domain"
                     676
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1050828"
                     850
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1050829"
                     1108..1992
                     /note="G6PD_C; Region: Glucose-6-phosphate dehydrogenase,
                     C-terminal domain"
                     2379
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1050757"
                     2392
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1063529"
                     2490
                     /allele="A"
                     /allele="G"
                     /db_xref="dbSNP:1050812"
                     2553
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:1050773"
                     2555
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:1050774"
                     2593
                     /allele="C"
                     /allele="T"
                     /db_xref="dbSNP:1050831"
BASE COUNT  527 a 884 c 797 g 423 t

ORIGIN 

1 agggacagcc cagaggaggc gtggccacgc tgccggcgga agtggagccc tccgcgagcg 
61 cgcgaggccg ccggggcagg cggggaaacc ggacagtagg ggcggggccg ggccggcgat 
121 ggggatgcgg gagcactacg cggagctgca cccgtgcccg ccggaattgg ggatgcagag 
181 cagcggcagc gggtatggca ggcagccggc gggccggcct ccagcgcagg tgcccgagag 
241 gcaggggctg gcctgggatg cgcgcgcacc tgccctcgcc ccgccccgcc cgcacgaggg 
301 gtggtggccg aggccccgcc ccgcacgcct cgcctgaggc gggtccgctc agcccaggcg 
361 cccgcccccg cccccgccga ttaaatgggc cggcggggct cagcccccgg aaacggtcgt 
421 acacttcggg gctgcgagcg cggagggcga cgacgacgaa gcgcagacag cgtcatggca 
481 gagcaggtgg ccctgagccg gacccaggtg tgcgggatcc tgcgggaaga gcttttccag 
541 ggcgatgcct tccatcagtc ggatacacac atattcatca tcatgggtgc atcgggtgac 
601 ctggccaaga agaagatcta ccccaccatc tggtggctgt tccgggatgg ccttctgccc 
661 gaaaacacct tcatcatggg ctatgcccgt tcccgcctca cagtggctga catccgcaaa 
721 cagagtgagc ccttcttcaa ggccacccca gaggagaagc tcaagctgga ggacttcttt 
781 gcccgcaact cctatgtggc tggccagtac gatgatgcag cctcctacca gcgcctcaac 
841 agccacatgg atgccctcca cctggggtca caggccaacc gcctcttcta cctggccttg 
901 cccccgaccg tctacgaggc cgtcaccaag aacattcacg agtcctgcat gagccagata 
961 ggctggaacc gcatcatcgt ggagaagccc ttcgggaggg acctgcagag ctctgaccgg
1021 ctgtccaacc acatctcctc cctgttccgt gaggaccaga tctaccgcat cgaccactac
1081 ctgggcaagg agatggtgca gaacctcatg gtgctgagat ttgccaacag gatcttcggc
1141 cccatctgga accgggacaa catcgcctgc gttatcctca ccttcaagga gccctttggc
1201 actgagggtc gcgggggcta tttcgatgaa tttgggatca tccgggacgt gatgcagaac
1261 cacctactgc agatgctgtg tctggtggcc atggagaagc ccgcctccac caactcagat
1321 gacgtccgtg atgagaaggt caaggtgttg aaatgcatct cagaggtgca ggccaacaat
1381 gtggtcctgg gccagtacgt ggggaacccc gatggagagg gcgaggccac caaagggtac
1441 ctggacgacc ccacggtgcc ccgcgggtcc accaccgcca cttttgcagc cgtcgtcctc
1501 tatgtggaga atgagaggtg ggatggggtg cccttcatcc tgcgctgcgg caaggccctg
1561 aacgagcgca aggccgaggt gaggctgcag ttccatgatg tggccggcga catcttccac
1621 cagcagtgca agcgcaacga gctggtgatc cgcgtgcagc ccaacgaggc cgtgtacacc
1681 aagatgatga ccaagaagcc gggcatgttc ttcaaccccg aggagtcgga gctggacctg
1741 acctacggca acagatacaa gaacgtgaag ctccctgacg cctacgagcg cctcatcctg
1801 gacgtcttct gcgggagcca gatgcacttc gtgcgcagcg acgagctccg tgaggcctgg
1861 cgtattttca ccccactgct gcaccagatt gagctggaga agcccaagcc catcccctat
1921 atttatggca gccgaggccc cacggaggca gacgagctga tgaagagagt gggtttccag
1981 tatgagggca cctacaagtg ggtgaacccc cacaagctct gagccctggg cacccacctc
2041 cacccccgcc acggccaccc tccttcccgc cgcccgaccc cgagtcggga ggactccggg
2101 accattgacc tcagctgcac attcctggcc ccgggctctg gccaccctgg cccgcccctc
2161 gctgctgcta ctacccgagc ccagctacat tcctcagctg ccaagcactc gagaccatcc
2221 tggcccctcc agaccctgcc tgagcccagg agctgagtca cctcctctfac tcactccagc
2281 ccaacagaag gaaggaggag ggcgcccatt cgtctgtccc agagcttatt ggccactggg
2341 tctcactcct gagtggggcc agggtgggag ggagggacaa gggggaggaa aggggcgagc
2401 acccacgtga gagaatctgc ctgtggcctt gcccgccagc ctcagtgcca cttgacattc
2461 cttgtcacca gcaacatctc gagccccctg gatgtcccct gtcccaccaa ctctgcactc
2521 catggccacc ccgtgccacc cgtaggcagc ctctctgcta taagaaaagc agacgcagca
2581 gctgggaccc ctcccaacct caatgccctg ccattaaatc cgcaaacagc c
LOCUS       XM_049047 1564 bp mRNA linear PRI 16-JUL-2001
DEFINITION  Homo sapiens proliferation-associated 2G4, 38kD (PA2G4), mRNA.
ACCESSION   XM_049047
VERSION     XM_049047.1 GI:14759750 HCDR-3
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 1564)
  AUTHORS   NCBI Annotation Project.
  TITLE     Direct Submission
  JOURNAL   Submitted (12-JUL-2001) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
FEATURES             Location/Qualifiers
     source          1..1564
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="12"
     gene            1..1564
                     /gene="PA2G4"
                     /db_xref="LocusID:5036"
                     /db_xref="MIM:602145"
     CDS             120..1304
                     /gene="PA2G4"
                     /codon_start=1
                     /product="proliferation-associated 2G4, 38kD"
                     /protein_id="XP_049047.1"
                     /db_xref="GI:14759751"
                     /translation="MSGEDEQQEQTIAEDLVVTKYKMGGDIANRVLRSLVEASSSGVS
                     VLSLCEKGDAMIMEETGKIFKKEKEMKKGIAFPTSISVNNCVCHFSPLKSDQDYILKE
                     GDLVKIDLGVHVDGFIANVAHTFVVDVAQGTQVTGRKADVIKAAHLCAEAALRLVKPG
                     NQNTQVTEAWNKVAHSFNCTPIEGMLSHQLKQHVIDGEKTIIQNPTDQQKKDHEKAEF
                     EVHEVYAVDVLVSSGEGKAKDAGQRTTIYKRDPSKQYGLKMKTSRAFFSEVERRFDAM
                     PFTLRAFEDEKKARMGVVECAKHELLQPFNVLYEKEGEFVAQFKFTVLLMPNGPMRIT
                     SGPFEPDLYKSEMEVQDAELKALLQSSASRKTQKKKKKKASKTAENATSGETLEENEA
                     GD"

BASE COUNT  455 a 365 c 413 g 331 t

ORIGIN 

1 ctttcgctcg ccctctcctc gaggatcgag gggactctga ccacagcctg tggctgggaa 
61 gggagacaga ggcggcggcg gctcagggga aacgaggctg cagtggtggt agtaggaaga 
121 tgtcgggcga ggacgagcaa caggagcaaa ctatcgctga ggacctggtc gtgaccaagt 
181 ataagatggg gggcgacatc gccaacaggg tacttcggtc cttggtggaa gcatctagct 
241 caggtgtgtc ggtactgagc ctgtgtgaga aaggtgatgc catgattatg gaagaaacag 
301 ggaaaatctt caagaaagaa aaggaaatga agaaaggtat tgcttttccc accagcattt 
361 cggtaaataa ctgtgtatgt cacttctccc ctttgaagag cgaccaggat tatattctca 
421 aggaaggtga cttggtaaaa attgaccttg gggtccatgt ggatggcttc atcgctaatg
481 tagctcacac ttttgtggtt gatgtagctc aggggaccca agtaacaggg aggaaagcag
541 atgttattaa ggcagctcac ctttgtgctg aagctgccct acgcctggtc aaacctggaa
601 atcagaacac acaagtgaca gaagcctgga acaaagttgc ccactcattt aactgcacgc
661 caatagaagg tatgctgtca caccagttga agcagcatgt catcgatgga gaaaaaacca
721 ttatccagaa tcccacagac cagcagaaga aggaccatga aaaagctgaa tttgaggtac
781 atgaagtata tgctgtggat gttctcgtca gctcaggaga gggcaaggcc aaggatgcag
841 gacagagaac cactatttac aaacgagacc cctctaaaca gtatggactg aaaatgaaaa
901 cttcacgtgc cttcttcagt gaggtggaaa ggcgttttga tgccatgccg tttactttaa
961 gagcatttga agatgagaag aaggctcgga tgggtgtggt ggagtgcgcc aaacatgaac
1021 tgctgcaacc atttaatgtt ctctatgaga aggagggtga atttgttgcc cagtttaaat
1081 ttacagttct gctcatgccc aatggcccca tgcggataac cagtggtccc ttcgagcctg
1141 acctctacaa gtctgagatg gaggtccagg atgcagagct aaaggccctc ctccagagtt
1201 ctgcaagtcg aaaaacccag aaaaagaaaa aaaagaaggc ctccaagact gcagagaatg
1261 ccaccagtgg ggaaacatta gaagaaaatg aagctgggga ctgaggtggg tcccatctcc
1321 ccagcttgct gctcctgcct catccccttc ccaccaaacc ccagactctg tgaagtgcag
1381 ttcttctcca cctaggaccg ccagcagagc ggggggatct ccctgccccc accccagttc
1441 cccaacccac tcccttccaa caacaaccag ctccaactga ctctggtctt gggaggtgag
1501 gcttcccaac cacggaagac tactttaaat gaaaaaaaga aattgaataa taaaatcagg
1561 agtc
1: XM_052326 [gi:14748477]

LOCUS       XM_052326 3273 bp mRNA linear PRI 16-JUL-2001
DEFINITION  Homo sapiens DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 21
            (DDX21), mRNA.
ACCESSION   XM_052326
VERSION     XM_052326.1 GI:14748477 DDX21
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 3273)
  AUTHORS   NCBI Annotation Project.
  TITLE     Direct Submission
  JOURNAL   Submitted (12-JUL-2001) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
FEATURES             Location/Qualifiers
     source          1..3273
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="10"
     gene            1..3273
                     /gene="DDX21"
                     /note="GURDB; RH-II/GU"
                     /db_xref="LocusID:9188"
     CDS             35..1711
                     /gene="DDX21"
                     /codon_start=1
                     /product="DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 21"
                     /protein_id="XP_052326.1"
                     /db_xref="GI:14748478"
                     /translation="MPGKLRSDAGLESDTAMKKGETLRKQTEEKEKKEKPKSDKTEEI
                     AEEEETVFPKAKQVKKKAEPSEVDMNSPKSKKAKKKEEPSQNDISPKTKSLRKKKEPI
                     EKKVVSSKTKKVTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHPEPD
                     CNPSEAASEESNSEIEQEIPVEQKEGAFSNFPISEETIKLLKGRGVTFLFPIQAKTFH
                     HVYSGKDLIAQARTGTGKTFSFAIPLIEKLHGELQDRKRGRAPQVLVLAPTRELANQV
                     SKDFSDITKKLSVACFYGGTPYGGQFERMRNGIDILVGTPGRIKDHIQNGKLDLTKLK
                     HVVLDEVDQMLDMGFADQVEEILSVAYKKDSEDNPQTLLFSATCPHWVFNVAKKYMKS
                     TYEQVDLIGKKTQKTAITVEHLAIKCHWTQRAAVIGDVIRVYSGHQGRTIIFCETKKE
                     AQELSQNSAIKQDAQSLHGDIPQKQREITLKGFRNGSFGVLVATNVAARGLDIPEVDL
                     VIQSSPPKGCRVLHSSIRADRQSWKDGGVHLLLSAQGRISVSTSGAKSGN"

BASE COUNT 1068 a 603 c 773 g 829 t 

ORIGIN 

1 gaagaccggt cggcctgggc aacctgcgct gaagatgccg ggaaaactcc gtagtgacgc 
61 tggtttggaa tcagacaccg caatgaaaaa aggggagaca ctgcgaaagc aaaccgagga 
121 gaaagagaaa aaagagaagc caaaatctga taagactgaa gagatagcag aagaggaaga 
181 aactgttttc cccaaagcta aacaagttaa aaagaaagca gagccttctg aagttgacat 
241 gaattctcct aaatccaaaa aggcaaaaaa gaaagaggag ccatctcaaa atgacatttc
301 tcctaaaacc aaaagtttga gaaagaaaaa ggagccdatftf ga'aaTa'c(a1aa§' tggtttcttc
361 taaaaccaaa aaagtgacaa aaaatgagga gccttctgag gaagaaatag atgctcctaa
421 gcccaagaag atgaagaaag aaaaggaaat gaatggagaa actagagaga aaagccccaa
481 actgaagaat ggatttcctc atcctgaacc ggactgtaac cccagtgaag ctgccagtga
541 agaaagtaac agtgagatag agcaggaaat acctgtggaa caaaaagaag gcgctttctc
601 taattttccc atatctgaag aaactattaa acttctcaaa ggccgaggag tgaccttcct
661 atttcctata caagcaaaga cattccatca tgtttacagc gggaaggact taattgcaca
721 ggcacggaca ggaactggga agacattctc ctttgccatc cctttgattg agaaacttca
781 tggggaactg caagacagga agagaggccg tgcccctcag gtactggttc ttgcacctac
841 aagagagttg gcaaatcaag taagcaaaga cttcagtgac atcacaaaaa agctgtcagt
901 ggcttgtttt tatggtggaa ctccctatgg aggtcaattt gaacgcatga ggaatgggat
961 tgatatcctg gttggaacac caggtcgtat caaagaccac atacagaatg gcaaactaga
1021 tctcaccaaa cttaagcatg ttgtcctgga tgaagtggac cagatgttgg atatgggatt
1081 tgctgatcaa gtggaagaga ttttaagtgt ggcatacaag aaagattctg aagacaatcc
1141 ccaaacattg cttttttctg caacttgccc tcattgggta tttaatgttg ccaagaaata
1201 catgaaatct acatatgaac aggtggacct gattggtaaa aagactcaga aaacggcaat
1261 aactgtggag catctggcta ttaagtgcca ctggactcag agggcagcag ttattgggga
1321 tgtcatccga gtatatagtg gtcatcaagg acgcactatc atcttttgtg aaaccaagaa
1381 agaagcccag gagctgtccc agaattcagc tataaagcag gatgctcagt ccttgcatgg
1441 agacattcca cagaagcaaa gggaaatcac cctgaaaggt tttagaaatg gtagttttgg
1501 agttttggtg gcaaccaatg ttgctgcacg tgggttagac atccctgagg ttgatttggt
1561 tatacaaagc tctccaccaa agggatgtag agtcctacat tcatcgatcc gggcggacag
1621 gcagagctgg aaggacgggg gtgtgcatct gcttttatca gcacaaggaa gaatatcagt
1681 tagtacaagt ggagcaaaaa gcgggaatta agttcaaacg aataggtgtt ccttctgcaa
1741 cagaaataat aaaagcttcc agcaaagatg ccatcaggct tttggattcc gtgcctccca
1801 ctgccattag tcacttcaaa caatcagctg agaagctgat agaggagaag ggagctgtgg
1861 aagctctggc agcagcactg gcccatattt caggtgccac gtccgtagac cagcgctcct
1921 tgatcaactc aaatgtgggt tttgtgacca tgatcttgca gtgctcaatt gaaatgccaa
1981 atattagtta tgcttggaaa gaacttaaag agcagctggg cgaggagatt gattccaaag
2041 tgaagggaat ggtttttctc aaaggaaagc tgggtgtttg ctttgatgta cctaccgcat
2101 cagtaacaga aatacaggag aaatggcatg attcacgacg ctggcagctc tctgtggcca
2161 cagagcaacc agaactggaa ggaccacggg aaggatatgg aggcttcagg ggacagcggg
2221 aaggcagtcg aggcttcagg ggacagcggg acggaaacag aagattcaga ggacagcggg
2281 aaggcagtag aggcccgaga ggacagcgat caggaggtgg caacaaaagt aacagatccc
2341 aaaacaaagg ccagaagcgg agtttcagta aagcatttgg tcaataatta gaaatagaag
2401 atttatatag caaaaagaga atgatgtttg gcaatataga actgaacatt atttttcatg
2461 caaagttaaa agcacattgt gcctcctttt gaccacttgc caagtccctg tctctttcag
2521 acacagacaa gcttcattta aattatttca tctgatcatt atcatttata actttattgt
2581 tacttcttca tcagtttttc cttttgaaag gtgtatgaat tcattacttt tttattctaa
2641 tgtattatct gtagattaga agataaaatc aagcatgtat ctgcctatac tttgtgagtt
2701 cacctgtctt tatactcaaa agtgtccctt aatagtgtcc ttccctgaaa taaataccta
2761 agggagtgta acagtctctg gaggaccact ttgagccttt ggaagttaag gtttcctcag
2821 ccacctgccg aacagtttct catgtggtcc tattatttgt ctactgagac ttaatactga
2881 gcaatgtttt gaaacaagat ttcaaactaa tctgggttgt aatacagttt ataccagtgt
2941 atgctctaga cttggaagat gtagtatgtt tgatgtggat tacctatact tatgttcgtt
3001 ttgatacatt tttagcttct cattataagg tgattcatgc tttagtgaat tcttcataga
3061 tagtatatat aaaagtacat tttaatagaa agccagggtt ttaaggaatt tcacatgtat
3121 aaggtggctc catagcttta tttgtaagta ggctggataa atggtgctta aatggtaatg
3181 tactccactt cttcctattg gaagattaac attatttacc aagaaggact taagggagta
3241 gggggcgcag attagcattg ctcaagagta tgt
//
LOCUS       XM_030607 2005 bp mRNA linear PRI 16-OCT-2001
DEFINITION  Homo sapiens serine/threonine kinase 15 (STK15), mRNA.
ACCESSION   XM_030607
VERSION     XM_030607.1 GI:14786409
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 2005)
  AUTHORS   NCBI Annotation Project.
  TITLE     Direct Submission
  JOURNAL   Submitted (11-OCT-2001) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
FEATURES             Location/Qualifiers
     source          1..2005
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="20"
     gene            1..2005
                     /gene="STK15"
                     /note="BTAK; Located on Accession NT_011362"
                     /db_xref="LocusID:8465"
                     /db_xref="MIM:603072"
     CDS             26..1237
                     /gene="STK15"
                     /note="Located on Accession NT_011362"
                     /codon_start=1
                     /product="hypothetical protein XP_030607"
                     /protein_id="XP_030607.1"
                     /db_xref="GI:14786410"
                     /translation="MDRSKENCISGPVKATAPVGGPKRVLVTQQFPCQNPLPVNSGQA
                     QRVLCPSNSSQRIPLQAQKLVSSHKPVQNQKQKQLQATSVPHPVSRPLNNTQKSKQPL
                     PSAPENNPEEELASKQKNEESKKRQWALEDFEIGRPLGKGKFGNVYLAREKQSKFILA
                     LKVLFKAQLEKAGVEHQLRREVEIQSHLRHPNILRLYGYFHDATRVYLILEYAPLGTV
                     YRELQKLSKFDEQRTATYITELANALSYCHSKRVIHRDIKPENLLLGSAGELKIADFG
                     WSVHAPSSRRTTLCGTLDYLPPEMIEGRMHDEKVDLWSLGVLCYEFLVGKPPFEANTY
                     QETYKRISRVEFTFPDFVTEGARDLISRLLKHNPSQRPMLREVLEHPWITANSSKPSN
                     CQNKESASKQS"
                     422..1174
                     /note="S_TKc; Region: Serine/Threonine protein kinases,
                     catalytic domain"
                     422..1174
                     /note="pkinase; Region: Protein kinase domain"
                     425..1162
                     /note="TyrKc; Region: Tyrosine kinase, catalytic domain"
BASE COUNT  585 a 434 c 456 g 530 t
ORIGIN
1 cttgggtcct tgggtcgcag gcatcatgga ccgatctaaa gaaaactgca tttcaggacc
61 tgttaaggct acagctccag ttggaggtcc aaaacgtgtt ctcgtgactc agcaatttcc 
121 ttgtcagaat ccattacctg taaatagtgg ccaggctcag cgggtcttgt gtccttcaaa 
181 ttcttcccag cgcattcctt tgcaagcaca aaagcttgtc tccagtcaca agccggttca 
241 gaatcagaag cagaagcaat tgcaggcaac cagtgtacct catcctgtct ccaggccact 
301 gaataacacc caaaagagca agcagcccct gccatcggca cctgaaaata atcctgagga 
361 ggaactggca tcaaaacaga aaaatgaaga atcaaaaaag aggcagtggg ctttggaaga 
421 ctttgaaatt ggtcgccctc tgggtaaagg aaagtttggt aatgtttatt tggcaagaga 
481 aaagcaaagc aagtttattc tggctcttaa agtgttattt aaagctcagc tggagaaagc 
541 cggagtggag catcagctca gaagagaagt agaaatacag tcccaccttc ggcatcctaa 
601 tattcttaga ctgtatggtt atttccatga tgctaccaga gtctacctaa ttctggaata 
661 tgcaccactt ggaacagttt atagagaact tcagaaactt tcaaagtttg atgagcagag 
721 aactgctact tatataacag aattggcaaa tgccctgtct tactgtcatt cgaagagagt 
781 tattcataga gacattaagc cagagaactt acttcttgga tcagctggag agcttaaaat 
841 tgcagatttt gggtggtcag tacatgctcc atcttccagg aggaccactc tctgtggcac 
901 cctggactac ctgccccctg aaatgattga aggtcggatg catgatgaga aggtggatct 
961 ctggagcctt ggagttcttt gctatgaatt tttagttggg aagcctcctt ttgaggcaaa 
1021 cacataccaa gagacctaca aaagaatatc acgggttgaa ttcacattcc ctgactttgt 
1081 aacagaggga gccagggacc tcatttcaag actgttgaag cataatccca gccagaggcc 
1141 aatgctcaga gaagtacttg aacacccctg gatcacagca aattcatcaa aaccatcaaa
1201 ttgccaaaac aaagaatcag ctagcaaaca gtcttaggaa tcgtgcaggg ggagaaatcc 
1261 ttgagccagg gctgccatat aacctgacag gaacatgcta ctgaagttta ttttaccatt 
1321 gactgctgcc ctcaatctag aacgctacac aagaaatatt tgttttactc agcaggtgtg 
1381 ccttaacctc cctattcaga aagctccaca tcaataaaca tgacactctg aagtgaaagt 
1441 agccacgaga attgtgctac ttatactggt tcataatctg gaggcaaggt tcgactgcag 
1501 ccgccccgtc agcctgtgct aggcatggtg tcttcacagg aggcaaatcc agagcctggc 
1561 tgtggggaaa gtgaccactc tgccctgacc ccgatcagtt aaggagctgt gcaataacct 
1621 tcctagtacc tgagtgagtg tgtaacttat tgggttggcg aagcctggta aagctgttgg 
1681 aatgagtatg tgattctttt taagtatgaa aataaagata tatgtacaga cttgtatttt 
1741 ttctctggtg gcattccttt aggaatgctg tgtgtctgtc cggcaccccg gtaggcctga
1801 ttgggtttct agtcctcctt aaccacttat ctcccatatg agagtgtgaa aaataggaac 
1861 acgtgctcta cctccattta gggatttgct tgggatacag aagaggccat gtgtctcaga 
1921 gctgttaagg gcttattttt ttaaaacatt ggagtcatag catgtgtgta aactttaaat 
1981 atgcaaataa ataagtatct atgtc 

//
LOCUS       BC008442 1584 bp mRNA linear PRI 12-JUL-2001
DEFINITION  Homo sapiens, Similar to transmembrane 4 superfamily member 1,
            clone MGC:14656 IMAGE:4101110, mRNA, complete cds.
ACCESSION   BC008442
VERSION     BC008442.1 GI:14250074
KEYWORDS    MGC.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 1584)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (25-MAY-2001) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: ATCC
            cDNA Library Preparation: CLONTECH Laboratories, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Sequencing Group at the Stanford Human Genome
            Center, Stanford University School of Medicine, Stanford, CA 94305
            Web site: http://www-shgc.stanford.edu
            Contact: (Dickson, Mark) mcd@paxil.stanford.edu
            Dickson, M., Schmutz, J., Grimwood, J., Rodriquez, A., and Myers,
            R. M.
            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAL Plate: 21 Row: 1 Column: 7
            This clone was selected for full length sequencing because it
            passed the following selection criteria: Similarity but not
            identity to protein.
FEATURES             Location/Qualifiers
     source          1..1584
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /clone="MGC:14656 IMAGE:4101110"
                     /tissue_type="Bone marrow, chronic myelogenous leukemia"
                     /clone_lib="NIH_MGC_54"
                     /lab_host="DH10B"
                     /note="Vector: pDNR-LIB"
     CDS             102..710
                     /codon_start=1
                     /product="Similar to transmembrane 4 superfamily member 1"
                     /protein_id="AAH08442.1"
                     /db_xref="GI:14250075"
                     /translation="MCYGKCARCIGHSLVGLALLCIAANILLYFPNGETKYASENHLS
                     RFVWFFSGIVGGGLLMLLPAFVFIGLEQDDCCGCCGHENCGKRCAMLSSVLAALIGIA
                     GSGYCVIVAALGLAEGPLCLDSLGQWNYTFASTEGQYLLDTSTWSECTEPKHIVEWNV
                     SLFSILLALGGIEFILCLIQVINGVLGGICGFCCSHQQQYDC"



BASE COUNT  460 a 311 c 337 g 476 t
ORIGIN
1 gtggtgtttg ctttctccac cagaagggca cactttcatc taatttgggg tatcactgag
61 ctgaagacaa agagaagggg gagaaaacct agcagaccac catgtgctat gggaagtgtg
121 cacgatgcat cggacattct ctggtggggc tcgccctcct gtgcatcgcg gctaatattt
181 tgctttactt tcccaatggg gaaacaaagt atgcctccga aaaccacctc agccgcttcg
241 tgtggttctt ttctggcatc gtaggaggtg gcctgctgat gctcctgcca gcatttgtct
301 tcattgggct ggaacaggat gactgctgtg gctgctgtgg ccatgaaaac tgtggcaaac
361 gatgtgcgat gctttcttct gtattggctg ctctcattgg aattgcagga tctggctact
421 gtgtcattgt ggcagccctt ggcttagcag aaggaccact atgtcttgat tccctcggcc
481 agtggaacta cacctttgcc agcaccgagg gccagtacct tctggatacc tccacatggt
541 ccgagtgcac tgaacccaag cacattgtgg aatggaatgt atctctgttt tctatcctct
601 tggctcttgg tggaattgaa ttcatcttgt gtcttattca agtaataaat ggagtgcttg
661 gaggcatatg tggcttttgc tgctctcacc aacagcaata tgactgctaa aagaaccaac
721 ccaggacaga gccacaatct tcctctattt cattgtaatt tatatatttc acttgtattc
781 atttgtaaaa ctttgtatta gtgtaacata ctccccacag tctactttta caaacgcctg
841 taaagactgg catcttcaca ggatgtcagt gtttaaattt agtaaacttc ttttttgttt
901 gtttatttgt ttttgttttt tttttaggaa tgaggaaaca aaccaccctc tgggggtagt
961 ttacagactg agtgacagta ctcagtatat ctgagataaa ctctataatg ttttggataa
1021 aaataacatt ccaatcacta ttgtatatat gtgcatgtat tttttaaatt aaagatgtct
1081 agttgctttt tataagacca agaaggagaa aatccgacaa cctggaaaga tttttgtttt
1141 cactgcttgt atgatgtttc ccattcatac acctataaat ctctaacaag aggccctttg
1201 aactgccttg tgttctgtga gaaacaaata tttacttaga gtggaaggac tgattgagaa
1261 tgttccaatc caaatgaatg catcacaact tacaatgctg ctcattgttg tgagtactat
1321 gagattcaaa tttttctaac atatggaaag ccttttgtcc tccaaagatg agtactaggg
1381 atcatgtgtt taaaaaaaag aaaggctacg atgactgggc aagaagaaag atgggaaact
1441 gaataaagca gttgatcagc atcattggaa catggggacg agtgacggca ggaggaccac
1501 gaggaaatac cctcaaaact aacttgttta caacaaaata aagtattcac tacgaaaaaa
1561 aaaaaaaaaa aaaaaaaaaa aaaa
//
1: XM_027538 [gi:14768648]

LOCUS       XM_027538 1025 bp mRNA linear PRI 16-JUL-2001
DEFINITION  Homo sapiens excision repair cross-complementing rodent repair
            deficiency, complementation group 1 (includes overlapping antisense
            sequence) (ERCC1), mRNA.
ACCESSION   XM_027538
VERSION     XM_027538.1 GI:14768648 ERCC1
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 1025)
  AUTHORS   NCBI Annotation Project.
  TITLE     Direct Submission
  JOURNAL   Submitted (12-JUL-2001) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
FEATURES             Location/Qualifiers
     source          1..1025
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="19"
     gene            1..1025
                     /gene="ERCC1"
                     /note="UV20"
                     /db_xref="LocusID:2067"
                     /db_xref="MIM:126380"
     CDS             63..956
                     /gene="ERCC1"
                     /codon_start=1
                     /product="excision repair cross-complementing rodent
                     repair deficiency, complementation group 1 (includes
                     overlapping antisense sequence)"
                     /protein_id="XP_027538.1"
                     /db_xref="GI:14768649"
                     /translation="MDPGKDKEGVPQPSGPPARKKFVIPLDEDEVPPGVAKPLFRSTQ
                     SLPTVDTSAQAAPQTYAEYAISQPLEGAGATCPTGSEPLAGETPNQALKPGAKSNSII
                     VSPRQRGNPVLKFVRNVPWEFGDVIPDYVLGQSTCALFLSLRYHNLHPDYIHGRLQSL
                     GKNFALRVLLVQVDVKDPQQALKELAKMCILADCTLILAWSPEEAGRYLETYKAYEQK
                     PADLLMEKLEQDFVSRVTECLTTVKSVNKTDSQTLLTTFGSLEQLIAASREDLALCPG
                     LGPQKARRLFDVLHEPFLKVP"



BASE COUNT  234 a 326 c 289 g 176 t
ORIGIN
1 ccaagaccag caggtgaggc ctcgcggcgc tgaaaccgtg aggcccggac cacaggctcc
61 agatggaccc tgggaaggac aaagaggggg tgccccagcc ctcagggccg ccagcaagga
121 agaaatttgt gatacccctc gacgaggatg aggtccctcc tggagtggcc aagcccttat
181 tccgatctac acagagcctt cccactgtgg acacctcggc ccaggcggcc cctcagacct
241 acgccgaata tgccatctca cagcctctgg aaggggctgg ggccacgtgc cccacagggt
301 cagagcccct ggcaggagag acgcccaacc aggccctgaa acccggggca aaatccaaca
361 gcatcattgt gagccctcgg cagaggggca atcccgtact gaagttcgtg cgcaatgtgc
421 cctgggaatt tggcgacgta attcccgact atgtgctggg ccagagcacc tgtgccctgt
481 tcctcagcct ccgctaccac aacctgcacc cagactacat ccatgggcgg ctgcagagcc
541 tggggaagaa cttcgccttg cgggtcctgc ttgtccaggt ggatgtgaaa gatccccagc
601 aggccctcaa ggagctggct aagatgtgta tcctggccga ctgcacattg atcctcgcct
661 ggagccccga ggaagctggg cggtacctgg agacctacaa ggcctatgag cagaaaccag
721 cggacctcct gatggagaag ctagagcagg acttcgtctc ccgggtgact gaatgtctga
781 ccaccgtgaa gtcagtcaac aaaacggaca gtcagaccct cctgaccaca tttggatctc
841 tggaacagct catcgccgca tcaagagaag atctggcctt atgcccaggc ctgggccctc
901 agaaagcccg gaggctgttt gatgtcctgc acgagccctt cttgaaagta ccctgatgac
961 cccagctgcc aaggaaaccc ccagtgtaat aataaatcgt cctcccaggc caggctcctg
1021 ctggc
//
SEQ ID NO: 29
Size: 181
DNA FANCA

CCAGTGTGCTGGAAAGGAGGAAGATATCCTGGCTGGCACTCTTTCAGTTGACAGAGAGTGACCTCAGGCTGGGGC
GGCTCCTCCTCCGTGTGGCCCCGGATCAGCACACCAGGCTGCTGCCTTTCGCTTTTTACAGTCTTCTCTCCTACT 
TCCATGAAGACGCGGCTTTCCAGCACAGTGG 

SEQ ID NO: 30 
Size: 603 
DNA DDX9 

CCAGTGTGCTGGAAAGCGCCACCTCCTCTTCCCTGTCCAAAGTAGCCAGTTCCATAGGCCCCCCTACCACCWCCT 
CGCTGGAATCCCCCAGATCCTCTGTAGCCTCCACTAGGCCCTCTGTAGTCTCCTCCAGAGTTGCCTCTAAAGCCA 
CCTCGGGAGACTCCTCTATAGCCTCCACCAACACCTGCACCATATCCTGCCCGAAAGGAGTTGGCGCTGCCACCA 
TAGCCTCCGCTACCATAGCCTCCACTGCTATAGCCACCGCATAGCCTCCACCACTGTAACTAGAACCTCCCCTTC 
TATATCCGCTTCCATTGTCGTATCGGGCCATCTTGGGAGGACGTGGACCATCTCCATGCCGTGTACTGCCAATCA 
TAAGGTTGATACCAGCAGCTGAGGGTCTAGAGATCrc^^ 

ACTGGCTGATGATAGCAGGTTGTTTGGTTACTTCAACAACCAAAGCCTCCATGGCTGCCCGGAGACCAGTGATAC 
AGGCAGCAGCTTCATGAGATATTTGCAGTTTAATCCAGTCATCTACAAGCACAATCTGCCCACTTTCCAGCACAG 
TGG 

SEQ ID NO: 31 
Size: 145 
DNA IGF1R 

CCAGTGTGTTGGAAAGGGAGAGAACTGTCATTTCTAACCTTCGGCCTTTCACATTGTACCGCATCGATATCCACA 
GCTGCAACCACGAGGCTGAGAAGCTGGGCTGCAGCGCCTCCAACTTCGTCTTTGCTTTCCAGCACAGTGG 

SEQ ID NO: 32 
Size: 269 
DNA UBEV2V1 



CCAGTGTGCTGGAAAGGTGCTTCTGGGTATTTAGGTCCACATTCTATTTTAAGGCTGTATATTCGGTTTTCATAA 
ATTGTTCTTGGAGGCCCAATTATCATCCCTGTCCATCTTGTAAGATGTCATGTCTTCGTCATCTTCTAGACCCCA 
GCTAACTGTGCCATCTCCTACTCCTTTCTGGCCTTCTTCGAGATTCCTCCAACAGTCGGAAATTGCGAGGGACTT 
TATACATCCCGAGCCCGTGGTGGCTGCCCTTTCCAGCACACTGG 



SEQ ID NO: 33 
Size: 499 

DNA aldehyde dehydrogenase 

CCAGTGTGCTGGAAAGGAGCAAACTCCTCTCACTGCTCTCCACGTGGCATCTTTAATAAAAGAGGCAGGGTTTCC 
TCCTGGAGTAGTGAATATTGTTCCTGGTTATGGGCCTACAGCAGGGGCAGCCATTTCTTCTCACATGGATATAGA 
CAAAGTAGCCTTCACAGGATCAACAGAGGTTGGCAAGTTGATCAAAGAAGCTGCCGGGAAAAGCAATCTGAAGAG 
GGTGACCCTGGAGCTTGGAGGAAAGAGCCCTTGCATTGTGTTAGCTGATGCCGACTTGGACAATGCTGTTGAATT 
TGCACACCATGGGGTATTCTACCACCAGGGCCAGTGTTGTATAGCCGCATCCAGGATTTTTGTGGAAGAATCAAT 
TTATGATGAGTTTTGTTCGAAGGAGTGTTGAGCGGGCTAAGAACGTATATCCTTGGAAACATCCTCTGACCCCAG 
GAGTCACTCAAAGGCCCTCAGATTGACAAGGACTTTCCAGACACAGTGG 

SEQ ID NO: 34 
Size: 425 

DNA pyruvate kinase

CCAGTGTGCTGGAAAGGCTGCCCACTTCCACCACCTTGCAGATGTTCTTGTAGTCCAGCCACAGGATGTTCTCGT
CACACTTTTCCATGTAGGCGTTATCCAGCGTGATTTTGAGAGTGGCTCCCTTCTTCAGCTCCACCTCTGCAGTGC 
CGCTGCCCTTGATGAGCCCAGTTCGGATCTCAGGTCCTTTAGTGTCTAGAGCCACAGCAACGGGCCGGTAGAGGA 
TGGGGTCAGAAGCAAAGCTTTCCGTGGCTGTGCGCACATTCTTGATGGTCTCCGCATGGTACTCATGAGTTCCAT 
GAGAGAAGTTCAGACGAGCCACATTCATTCCAGACTTAATCATCTCCTTCAACGTCTCCACTGGATCGGGAAGCT 
GGGCCAATGGTACAGATGATGCCAGTGTTCCGGGCTTTCCAGCACAGTGG 

SEQ ID NO: 35
Size:
DNA G6PD

CCAGTGTGCTGGAAACTTTCCAGTTCTCCATGGCCACCANACACAGCATCTGCAGTAGGTGGTTCTGCATCACGT
CCCGGATGATCCCAAATTCATCGAAATAGCCCCCGCGACCCTCAGTGCCAAAGGGCTCCTTGAAGGTGAGGATAA
CGCAGGCGATGTTGTCCCGGTTCCANATGGGGCCGAAGATCCTGTTGGCAAATCTCAGCACCATGAGGTTCTCTT
TCCAGCACAGTGG